@rbalchii/anchor-tagwalker-wasm

v1.0.0

Published

3 months ago

Graph-based associative search with 70/30 budget split (STAR algorithm) (WASM)

0High
0Medium
0Low

rbalchii

search graph tag associative star wasm

anchor-tagwalker

Graph-based associative search implementing the STAR algorithm with 70/30 budget split

Features

STAR Algorithm: Semantic Temporal Associative Retrieval
70/30 Budget Split: Planets (direct matches) + Moons (graph discoveries)
Gravity Scoring: Unified field equation with temporal decay + SimHash similarity
Radial Inflation: Multi-hop tag walking with damping
Synonym Rings: Tag expansion for broader search coverage
Bipartite Graph: Atoms ↔ Tags structure for efficient traversal

Quick Start

use anchor_tagwalker::{TagWalker, TagWalkerConfig};

// Create walker
let mut walker = TagWalker::new();

// Add atoms with tags
walker.add_atom(1, "Rust programming", vec!["#rust".to_string(), "#programming".to_string()]);
walker.add_atom(2, "Python scripting", vec!["#python".to_string(), "#scripting".to_string()]);
walker.add_atom(3, "Rust performance", vec!["#rust".to_string(), "#performance".to_string()]);

// Configure search
let config = TagWalkerConfig::default();

// Search
let results = walker.search("#rust", &config);

for result in results {
    println!("Atom {}: relevance {:.3} ({:?})", 
             result.atom_id, result.relevance, result.result_type);
}

The STAR Algorithm

Gravity Equation

gravity = (shared_tags) × e^(-λ×Δt) × (1 - hamming_distance/64) × damping
               │              │              │            │
               │              │              │            └─ Multi-hop damping (0.85)
               │              │              └─ SimHash similarity (0.0-1.0)
               │              └─ Temporal decay (λ=0.00001)
               └─ Tag association count

Search Phases

Planets (70% budget): Direct FTS matches on content/tags
Moons (30% budget): Graph-discovered associations via shared tags

Configuration

use anchor_tagwalker::TagWalkerConfig;

// Default config (70/30 split, λ=0.00001, damping=0.85)
let config = TagWalkerConfig::default();

// Quick search (fewer results, single hop)
let config = TagWalkerConfig::quick();

// Deep exploration (more results, 3 hops)
let config = TagWalkerConfig::deep();

// Custom config
let config = TagWalkerConfig::new()
    .with_planet_budget(0.8)
    .with_moon_budget(0.2)
    .with_max_results(100)
    .with_max_hops(2)
    .with_damping(0.9);

API

TagWalker

pub struct TagWalker {
    // Bipartite graph: atoms ↔ tags
}

impl TagWalker {
    pub fn new() -> Self;
    pub fn add_atom(&mut self, id: u64, content: &str, tags: Vec<String>);
    pub fn search(&self, query: &str, config: &TagWalkerConfig) -> Vec<SearchResult>;
    pub fn search_with_budget(&self, query: &str, config: &TagWalkerConfig, total_tokens: usize) -> ContextPackage;
}

SearchResult

pub struct SearchResult {
    pub atom_id: u64,           // Atom ID
    pub relevance: f32,         // Gravity score
    pub matched_tags: Vec<String>, // Tags that matched
    pub result_type: ResultType,   // Planet or Moon
    pub path: Vec<String>,      // Path from query (for moons)
}

pub enum ResultType {
    Planet,  // Direct FTS match
    Moon,    // Graph-discovered
}

Budget Allocation

use anchor_tagwalker::{TagWalker, TagWalkerConfig, BudgetAllocator};

let mut walker = TagWalker::new();
// ... add atoms ...

let config = TagWalkerConfig::default();
let package = walker.search_with_budget("#rust", &config, 8192);

println!("Planets: {}", package.planets.len());
println!("Moons: {}", package.moons.len());
println!("Budget used: {} / {}", package.tokens_used, package.budget_limit);
println!("Utilization: {:.1}%", package.utilization() * 100.0);

Synonym Ring

use anchor_tagwalker::TagWalker;
use std::collections::HashMap;

let mut walker = TagWalker::new();

// Load synonym ring
let mut ring = HashMap::new();
ring.insert("#rust".to_string(), vec![
    "#programming".to_string(),
    "#systems".to_string(),
    "#performance".to_string(),
]);

walker.set_synonym_ring(ring);

// Search for "#rust" will also match "#programming", etc.
let results = walker.search("#rust", &TagWalkerConfig::default());

Installation

[dependencies]
anchor-tagwalker = "0.1.0"

Or:

cargo add anchor-tagwalker

Usage Examples

Building a Knowledge Search Engine

use anchor_tagwalker::{TagWalker, TagWalkerConfig};

fn build_knowledge_index() -> TagWalker {
    let mut walker = TagWalker::new();
    
    // Add your knowledge atoms
    walker.add_atom(1, "Rust ownership system", vec!["#rust", "#memory"]);
    walker.add_atom(2, "Python decorators", vec!["#python", "#metaprogramming"]);
    walker.add_atom(3, "Rust borrowing", vec!["#rust", "#references"]);
    
    walker
}

fn search_knowledge(walker: &TagWalker, query: &str) {
    let config = TagWalkerConfig::quick();
    let results = walker.search(query, &config);
    
    println!("Search results for '{}':", query);
    for result in results {
        let kind = match result.result_type {
            anchor_tagwalker::ResultType::Planet => "🪐",
            anchor_tagwalker::ResultType::Moon => "🌙",
        };
        println!("  {} Atom {}: {:.3}", kind, result.atom_id, result.relevance);
    }
}

Multi-Hop Discovery

use anchor_tagwalker::{TagWalker, TagWalkerConfig};

let mut walker = TagWalker::new();

// Create interconnected graph
walker.add_atom(1, "Rust", vec!["#rust", "#lang"]);
walker.add_atom(2, "Programming", vec!["#lang", "#coding"]);
walker.add_atom(3, "Software", vec!["#coding", "#engineering"]);
walker.add_atom(4, "Engineering", vec!["#engineering", "#math"]);

// Deep search (3 hops)
let config = TagWalkerConfig::deep();
let results = walker.search("#rust", &config);

// Will discover atoms beyond direct connections

Budget-Aware Context Assembly

use anchor_tagwalker::{TagWalker, TagWalkerConfig};

let walker = TagWalker::new();
let config = TagWalkerConfig::default();

// Allocate 4096 tokens for context
let package = walker.search_with_budget("#query", &config, 4096);

// Access results
for planet in &package.planets {
    println!("Planet: {:.3}", planet.relevance);
}

for moon in &package.moons {
    println!("Moon: {:.3}", moon.relevance);
}

// Check budget
println!("Used {} / {} tokens", package.tokens_used, package.budget_limit);

Architecture

Query
  │
  ▼
┌─────────────────┐
│ Expand Query    │  Synonym ring expansion
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Find Planets    │  Direct FTS matches (70% budget)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Find Moons      │  Graph walk via shared tags
│ (Gravity Score) │  gravity = tags × decay × sim × damping
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Radial Inflation│  Multi-hop walking (if configured)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Budget Allocate │  Respect 70/30 split
└────────┬────────┘
         │
         ▼
    Search Results

Testing

cargo test --all-features

Benchmarks

cargo bench

Sample output:

search_100_atoms          time:   [50.0 µs 52.0 µs 54.0 µs]
search_with_synonyms      time:   [60.0 µs 62.0 µs 64.0 µs]
radial_inflation          time:   [150.0 µs 155.0 µs 160.0 µs]
add_atom                  time:   [2.0 µs 2.1 µs 2.2 µs]

Performance Targets

| Metric | Target | |--------|--------| | Search p95 (100 atoms) | ≤200ms | | Gravity calculation | ≤1µs per atom | | Radial inflation (3 hops) | ≤500ms | | Memory usage | <10MB for 10k atoms |

License

AGPL-3.0 - See LICENSE for details.

Contributing

Read the specification
Follow code style
Write tests per testing standards
Submit a PR

Acknowledgments

STAR Algorithm: Original research from Anchor OS
SimHash: Moses Charikar (1997)
Graph traversal: Bipartite graph theory
Weighted reservoir sampling: Efraimidis-Spirakis algorithm

Dependencies

anchor-fingerprint: SimHash for similarity scoring
anchor-keyextract: Synonym ring support
rand: Random sampling for serendipity
serde: Serialization