@relay-science/vector

v0.1.0

Published

a month ago

Headless TypeScript library for DNA/RNA sequence editing, analysis, and visualization

relay-science

biovector

Headless TypeScript library for DNA/RNA sequence editing, analysis, and visualization. Zero runtime dependencies, fully tree-shakeable.

Built to power the genomics/molecular biology capabilities in biom.

Install

From biom, or any sibling project that references packages by file: path, add this to package.json:

"dependencies": {
  "biovector": "file:../biovector"
}

Then run bun install or npm install.

Quick Start

import {
  createSequence,
  reverseComplement,
  parseGenBank,
  digestSequence,
  detectORFs,
  designPrimers,
  needlemanWunsch,
} from 'biovector';

// Create a sequence
const seq = createSequence('ATGCGATCGATCG', { name: 'my-insert', isCircular: false });

// Reverse complement
const rc = reverseComplement(seq);

// Digest with restriction enzymes
const fragments = digestSequence(seq, ['EcoRI', 'BamHI']);

// Detect open reading frames (all 6 frames)
const orfs = detectORFs(seq);

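// Design primers for a target region. The coordinates must fall inside the
// template, so use a sequence longer than the 13-base placeholder above.
const primers = designPrimers(seq, { targetStart: 100, targetEnd: 500 });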

// Pairwise alignment
const alignment = needlemanWunsch('ATGCGATCG', 'ATGCAATCG');

Subpath Imports

Each module is available as a subpath export for smaller bundles:

import { createSequence } from 'biovector/core';
import { needlemanWunsch } from 'biovector/alignment';
import { parseGenBank } from 'biovector/parsers';
import { designGRNAs } from 'biovector/crispr';

Modules

`core` — Data Model & Sequence Manipulation

createSequence(), reverseComplement(), complement(), transcribe()
insertBases(), deleteBases(), getSubsequence()
gcContent(), molecularWeight(), validateSequence()
Circular-aware range arithmetic (rangesOverlap, splitAtOrigin, adjustRangeForInsert, etc.)
Generic undo/redo history (createHistory, pushState, undo, redo)
Full IUPAC codes, codon tables, nearest-neighbor thermodynamic parameters, BLOSUM62
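
To make the reverse-complement and GC-content operations concrete, here is a minimal standalone sketch that works on plain strings. The names reverseComplementString and gcFraction are hypothetical and chosen so they do not clash with the library's exports; this is not biovector's implementation, only an illustration of the technique with IUPAC-aware complementing.

```typescript
// Complement map covering the IUPAC nucleotide codes (DNA alphabet).
// Illustrative only: the library's own tables may be organized differently.
const COMPLEMENT: Record<string, string> = {
  A: 'T', T: 'A', G: 'C', C: 'G',
  R: 'Y', Y: 'R', S: 'S', W: 'W', K: 'M', M: 'K',
  B: 'V', V: 'B', D: 'H', H: 'D', N: 'N',
};

// Walk the string backwards and complement each base.
function reverseComplementString(bases: string): string {
  let out = '';
  for (let i = bases.length - 1; i >= 0; i--) {
    const base = bases.charAt(i).toUpperCase();
    const comp = COMPLEMENT[base];
    if (comp === undefined) {
      throw new Error(`Unsupported base "${base}" at position ${i}`);
    }
    out += comp;
  }
  return out;
}

// Fraction of unambiguous G or C bases (ambiguity codes are not counted).
function gcFraction(bases: string): number {
  if (bases.length === 0) return 0;
  let gc = 0;
  for (let i = 0; i < bases.length; i++) {
    const base = bases.charAt(i).toUpperCase();
    if (base === 'G' || base === 'C') gc++;
  }
  return gc / bases.length;
}
```

Both helpers return new values and leave their inputs untouched, in line with the pure-function convention described under Design Decisions.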

`enzymes` — Restriction Enzyme Engine

150+ restriction enzymes with recognition sites, cut offsets, and overhang types
findCutSites() — single or multi-enzyme cut site detection (circular-aware)
digestSequence() — multi-enzyme digest with fragment computation
computeGelPositions() — virtual gel electrophoresis (log-scale band positions)
IUPAC ambiguity code support in recognition sites
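
The following sketch shows one way circular-aware site matching with IUPAC ambiguity codes can work. It is an assumption-laden illustration, not the engine behind findCutSites(): the function name findSites is hypothetical, only the top strand is scanned (enough for palindromic sites), and it reports where the recognition site starts rather than where the enzyme cuts.

```typescript
// Which concrete bases each IUPAC code in a recognition site may match.
const IUPAC: Record<string, string> = {
  A: 'A', C: 'C', G: 'G', T: 'T',
  R: 'AG', Y: 'CT', S: 'CG', W: 'AT', K: 'GT', M: 'AC',
  B: 'CGT', D: 'AGT', H: 'ACT', V: 'ACG', N: 'ACGT',
};

// Returns 0-based start positions of every match of `site` in `bases`.
// On a circular sequence, matches may wrap around the origin.
function findSites(bases: string, site: string, isCircular: boolean): number[] {
  const seq = bases.toUpperCase();
  const pattern = site.toUpperCase();
  const n = seq.length;
  const m = pattern.length;
  const hits: number[] = [];
  if (m === 0 || m > n) return hits;
  const lastStart = isCircular ? n - 1 : n - m;
  for (let start = 0; start <= lastStart; start++) {
    let matches = true;
    for (let j = 0; j < m; j++) {
      const allowed = IUPAC[pattern.charAt(j)];
      const base = seq.charAt((start + j) % n);
      if (allowed === undefined || allowed.indexOf(base) < 0) {
        matches = false;
        break;
      }
    }
    if (matches) hits.push(start);
  }
  return hits;
}
```

For example, findSites('TTCAAAAGAA', 'GAATTC', true) finds the EcoRI site that spans the origin at position 7, while the same call with isCircular set to false finds nothing.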

`orf` — Open Reading Frame Detection

detectORFs() — all 6 reading frames, configurable start/stop codons, circular support
translateDNA() — DNA→protein translation
translateAllFrames() — all 6 frames at once
Internal start codon tracking
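
As an illustration of six-frame scanning, the sketch below searches the three forward frames and then the three frames of the reverse complement. It assumes a linear sequence, ATG as the only start codon, and the standard stop codons; the names Orf, scanStrand, and findOrfs are hypothetical and the real detectORFs() may differ in options and output shape.

```typescript
interface Orf {
  start: number;   // 0-based, inclusive, on the top strand
  end: number;     // 0-based, inclusive, on the top strand
  frame: number;   // 0, 1, or 2 within the scanned strand
  strand: 1 | -1;
}

const STOP_CODONS = ['TAA', 'TAG', 'TGA'];

function revComp(seq: string): string {
  const map: Record<string, string> = { A: 'T', T: 'A', G: 'C', C: 'G' };
  let out = '';
  for (let i = seq.length - 1; i >= 0; i--) {
    const c = map[seq.charAt(i)];
    out += c === undefined ? 'N' : c;
  }
  return out;
}

// Scan the three frames of one strand; `seq` is already oriented 5'->3'.
function scanStrand(seq: string, strand: 1 | -1, minCodons: number): Orf[] {
  const n = seq.length;
  const orfs: Orf[] = [];
  for (let frame = 0; frame < 3; frame++) {
    let orfStart = -1;
    for (let i = frame; i + 3 <= n; i += 3) {
      const codon = seq.substring(i, i + 3);
      if (orfStart < 0 && codon === 'ATG') {
        orfStart = i;
      } else if (orfStart >= 0 && STOP_CODONS.indexOf(codon) >= 0) {
        const codons = (i + 3 - orfStart) / 3; // includes the stop codon
        if (codons >= minCodons) {
          // Map reverse-strand coordinates back onto the top strand.
          const start = strand === 1 ? orfStart : n - (i + 3);
          const end = strand === 1 ? i + 2 : n - 1 - orfStart;
          orfs.push({ start, end, frame, strand });
        }
        orfStart = -1;
      }
    }
  }
  return orfs;
}

function findOrfs(bases: string, minCodons: number = 2): Orf[] {
  const seq = bases.toUpperCase();
  return scanStrand(seq, 1, minCodons).concat(scanStrand(revComp(seq), -1, minCodons));
}
```

Reporting every ORF in top-strand coordinates keeps reverse-strand hits directly comparable with annotations, which is one reasonable design choice among several.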

`primers` — Primer Design

designPrimers() — automated primer pair generation for a target region
calculateTm() — SantaLucia 1998 nearest-neighbor melting temperature
validatePrimer() — GC%, self-complementarity, hairpin, dimer, 3'-end stability checks
Salt-corrected Tm calculations
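
For a feel of what primer validation involves, here is a small standalone check. Note the swap: instead of the nearest-neighbor model used by calculateTm(), this sketch uses the much cruder Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rough guide for short oligos. The names PrimerReport and quickPrimerReport are hypothetical.

```typescript
interface PrimerReport {
  length: number;
  gcPercent: number;
  wallaceTm: number;   // degrees Celsius, Wallace rule estimate
  hasGcClamp: boolean; // true when the 3' base is G or C
}

function quickPrimerReport(primer: string): PrimerReport {
  const seq = primer.toUpperCase();
  if (seq.length === 0) throw new Error('Primer is empty');
  let at = 0;
  let gc = 0;
  for (let i = 0; i < seq.length; i++) {
    const base = seq.charAt(i);
    if (base === 'G' || base === 'C') gc++;
    else if (base === 'A' || base === 'T') at++;
    else throw new Error(`Unexpected base "${base}" at position ${i}`);
  }
  const last = seq.charAt(seq.length - 1);
  return {
    length: seq.length,
    gcPercent: (100 * gc) / seq.length,
    wallaceTm: 2 * at + 4 * gc,
    hasGcClamp: last === 'G' || last === 'C',
  };
}
```

A nearest-neighbor calculation additionally accounts for base stacking, strand concentration, and salt, so its values will generally differ from this estimate.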

`alignment` — Sequence Alignment

needlemanWunsch() — global pairwise alignment (affine gap penalties)
smithWaterman() — local pairwise alignment
generateDotPlot() — dot plot matrix with configurable window/threshold
multipleSequenceAlignment() — progressive MSA via UPGMA guide tree
Substitution matrices: BLOSUM62, PAM250, simple DNA match/mismatch
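
The global alignment idea can be sketched in a few dozen lines. This version deliberately uses a simple linear gap penalty and match/mismatch scoring rather than the affine gaps and substitution matrices listed above, so treat it as a teaching sketch, not as needlemanWunsch() itself. The names Alignment and globalAlign and the default scores are assumptions.

```typescript
interface Alignment {
  score: number;
  alignedA: string;
  alignedB: string;
}

function globalAlign(
  a: string, b: string,
  match: number = 1, mismatch: number = -1, gap: number = -2,
): Alignment {
  const rows = a.length + 1;
  const cols = b.length + 1;
  // table[i * cols + j] = best score aligning a[0..i) with b[0..j)
  const table: number[] = new Array<number>(rows * cols).fill(0);
  const get = (i: number, j: number): number => table[i * cols + j] as number;
  for (let i = 1; i < rows; i++) table[i * cols] = i * gap;
  for (let j = 1; j < cols; j++) table[j] = j * gap;
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const sub = a.charAt(i - 1) === b.charAt(j - 1) ? match : mismatch;
      table[i * cols + j] = Math.max(
        get(i - 1, j - 1) + sub, // align the two characters
        get(i - 1, j) + gap,     // gap in b
        get(i, j - 1) + gap,     // gap in a
      );
    }
  }
  // Trace back from the bottom-right corner, preferring diagonal moves.
  let i = a.length;
  let j = b.length;
  let alignedA = '';
  let alignedB = '';
  while (i > 0 || j > 0) {
    if (i > 0 && j > 0) {
      const sub = a.charAt(i - 1) === b.charAt(j - 1) ? match : mismatch;
      if (get(i, j) === get(i - 1, j - 1) + sub) {
        alignedA = a.charAt(i - 1) + alignedA;
        alignedB = b.charAt(j - 1) + alignedB;
        i--; j--;
        continue;
      }
    }
    if (i > 0 && get(i, j) === get(i - 1, j) + gap) {
      alignedA = a.charAt(i - 1) + alignedA;
      alignedB = '-' + alignedB;
      i--;
    } else {
      alignedA = '-' + alignedA;
      alignedB = b.charAt(j - 1) + alignedB;
      j--;
    }
  }
  return { score: get(a.length, b.length), alignedA, alignedB };
}
```

With these default scores, aligning the two Quick Start strings 'ATGCGATCG' and 'ATGCAATCG' gives eight matches and one mismatch with no gaps, for a score of 7.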

`analysis` — Sequence Analysis

calculateGCContent() — sliding window GC% for plotting
detectCpGIslands() — Gardiner-Garden & Frommer criteria
calculateCpGFrequency() — CpG/GpC observed/expected ratios
analyzeComposition() — nucleotide/dinucleotide/amino acid composition
Basic secondary structure prediction
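
The Gardiner-Garden and Frommer criteria are commonly stated as: length of at least 200 bp, GC content of at least 50%, and an observed/expected CpG ratio of at least 0.6, where observed/expected = (CpG count × N) / (C count × G count). The sketch below applies those thresholds to a single window. The names CpGStats, cpgStats, and looksLikeCpGIsland are hypothetical, and how detectCpGIslands() slides and merges windows is not shown here.

```typescript
interface CpGStats {
  gcFraction: number;
  obsExp: number; // observed/expected CpG ratio
}

function cpgStats(window: string): CpGStats {
  const seq = window.toUpperCase();
  const n = seq.length;
  let c = 0;
  let g = 0;
  let cpg = 0;
  for (let i = 0; i < n; i++) {
    const base = seq.charAt(i);
    if (base === 'C') {
      c++;
      if (seq.charAt(i + 1) === 'G') cpg++; // C followed directly by G
    } else if (base === 'G') {
      g++;
    }
  }
  return {
    gcFraction: n === 0 ? 0 : (c + g) / n,
    obsExp: c === 0 || g === 0 ? 0 : (cpg * n) / (c * g),
  };
}

// Applies the three thresholds to one candidate window.
function looksLikeCpGIsland(window: string): boolean {
  const stats = cpgStats(window);
  return window.length >= 200 && stats.gcFraction >= 0.5 && stats.obsExp >= 0.6;
}
```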

`codons` — Codon Tools

Species-specific codon usage tables (human, mouse, E. coli, yeast, Drosophila, C. elegans)
optimizeCodons() — codon optimization with GC balancing and restriction site avoidance
calculateCAI() — Codon Adaptation Index

`crispr` — CRISPR & RNAi Design

designGRNAs() — PAM scanning (SpCas9 NGG + configurable), on-target scoring
designShRNAs() — siRNA/shRNA target selection with knockdown scoring
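
PAM scanning for SpCas9 amounts to finding every NGG and taking the bases immediately 5' of it as the protospacer. The sketch below does this on the forward strand only, with no on-target scoring; the names GuideHit and findNggGuides and the 20-nt default are assumptions for illustration, and designGRNAs() presumably covers both strands and other PAMs.

```typescript
interface GuideHit {
  protospacer: string;
  pam: string;
  start: number; // 0-based position of the first protospacer base
}

function findNggGuides(bases: string, guideLength: number = 20): GuideHit[] {
  const seq = bases.toUpperCase();
  const hits: GuideHit[] = [];
  // p is the position of the PAM's first base (the "N" of NGG).
  for (let p = guideLength; p + 3 <= seq.length; p++) {
    if (seq.charAt(p + 1) === 'G' && seq.charAt(p + 2) === 'G') {
      hits.push({
        protospacer: seq.substring(p - guideLength, p),
        pam: seq.substring(p, p + 3),
        start: p - guideLength,
      });
    }
  }
  return hits;
}
```

To cover the reverse strand with the same function, one option is to run it on the reverse complement and map the coordinates back.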

`calculators` — Bio Calculators

calculateMolecularWeight() — DNA/RNA/protein MW from sequence
calculateConcentration() — OD260 → concentration (Beer-Lambert)
calculateDilution() — C1V1 = C2V2
estimateCellDensity() — OD600 → cells/mL
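
Two of these calculations are simple enough to show directly. The sketch uses the conventional conversion factors for a 1 cm path length (an A260 of 1.0 corresponds to roughly 50 µg/mL double-stranded DNA, 33 µg/mL single-stranded DNA, and 40 µg/mL RNA) and the C1V1 = C2V2 relation. The function names and signatures are hypothetical and need not match calculateConcentration() or calculateDilution().

```typescript
// Approximate µg/mL per A260 unit at 1 cm path length.
const UG_PER_ML_PER_A260 = { dsDNA: 50, ssDNA: 33, RNA: 40 } as const;
type NucleicAcid = keyof typeof UG_PER_ML_PER_A260;

// Concentration of the undiluted sample in µg/mL.
function concentrationFromA260(
  a260: number, kind: NucleicAcid, dilutionFactor: number = 1,
): number {
  return a260 * UG_PER_ML_PER_A260[kind] * dilutionFactor;
}

// C1V1 = C2V2 solved for V1: how much stock is needed for the final mix.
// Concentrations share one unit; the result is in the unit of finalVolume.
function stockVolumeNeeded(
  stockConc: number, finalConc: number, finalVolume: number,
): number {
  if (stockConc <= 0) throw new Error('Stock concentration must be positive');
  if (finalConc > stockConc) throw new Error('Cannot dilute to a higher concentration');
  return (finalConc * finalVolume) / stockConc;
}
```

For instance, a 1:10 dilution of dsDNA reading 0.5 at 260 nm works out to about 250 µg/mL in the original sample.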

`parsers` — File Format I/O

parseGenBank() / writeGenBank() — full GenBank format (LOCUS, FEATURES, ORIGIN)
parseFasta() / writeFasta() — multi-sequence FASTA
parseSnapGene() — SnapGene .dna binary format
parseAB1() — AB1/ABIF chromatogram traces (raw signal, base calls, quality scores)
parseSBOL() / writeSBOL() — SBOL XML
autoDetectFormat() — format detection from magic bytes/headers
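
Of the formats above, FASTA is the simplest to illustrate. The following is a minimal multi-record parser written from the format's general rules (a header line starting with ">", followed by one or more sequence lines). The names FastaRecord and parseFastaText are hypothetical; the record shape returned by parseFasta() may well differ.

```typescript
interface FastaRecord {
  id: string;          // first whitespace-delimited token of the header
  description: string; // remainder of the header, possibly empty
  sequence: string;    // concatenated sequence lines, upper-cased
}

function parseFastaText(text: string): FastaRecord[] {
  const records: FastaRecord[] = [];
  let current: FastaRecord | null = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '') continue; // skip blank lines
    if (line.charAt(0) === '>') {
      const header = line.slice(1).trim();
      const space = header.search(/\s/);
      current = {
        id: space < 0 ? header : header.slice(0, space),
        description: space < 0 ? '' : header.slice(space + 1).trim(),
        sequence: '',
      };
      records.push(current);
    } else if (current !== null) {
      // Sequence data before any header is ignored in this sketch.
      current.sequence += line.replace(/\s+/g, '').toUpperCase();
    }
  }
  return records;
}
```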

`ncbi` — NCBI Integration

ncbiSearch() / ncbiFetch() — E-utilities API client
autoAnnotateFromNCBI() — fetch gene records and convert to annotation features
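
For orientation, this is roughly what E-utilities requests look like at the URL level. The endpoints (esearch.fcgi, efetch.fcgi) and parameters (db, term, id, retmax, retmode, rettype) are part of NCBI's public API; the helper names below are hypothetical, and how ncbiSearch() and ncbiFetch() build or send their requests is not documented here.

```typescript
const EUTILS_BASE = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils';

// Encode a flat parameter map as a query string, preserving key order.
function buildQuery(params: Record<string, string>): string {
  return Object.keys(params)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(params[key] as string)}`)
    .join('&');
}

// URL for searching a database and getting matching record IDs as JSON.
function esearchUrl(db: string, term: string, retmax: number = 20): string {
  return `${EUTILS_BASE}/esearch.fcgi?${buildQuery({
    db, term, retmax: String(retmax), retmode: 'json',
  })}`;
}

// URL for fetching one record as GenBank flat-file text.
function efetchGenBankUrl(db: string, id: string): string {
  return `${EUTILS_BASE}/efetch.fcgi?${buildQuery({
    db, id, rettype: 'gb', retmode: 'text',
  })}`;
}
```

The resulting URLs can be passed to fetch() in any runtime that provides it; NCBI asks clients to respect its published rate limits.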

`annotations` — Annotation Management

adjustAnnotationsForInsert() / adjustAnnotationsForDelete() — auto-adjust on edits
stackAnnotations() — minimal vertical row stacking for display
Feature CRUD, merge, split operations
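
Adjusting annotations after an insertion is mostly coordinate bookkeeping. The sketch below covers linear, non-wrapping features with 0-based inclusive coordinates. The names Annotation and shiftForInsert are hypothetical, as is the choice to grow a feature when bases are inserted inside it; adjustAnnotationsForInsert() may apply different rules.

```typescript
interface Annotation {
  name: string;
  start: number; // 0-based, inclusive
  end: number;   // 0-based, inclusive
}

// `insertAt` is the index the first inserted base will occupy, so every
// original base at or after that index shifts right by `insertLength`.
function shiftForInsert(
  annotations: readonly Annotation[], insertAt: number, insertLength: number,
): Annotation[] {
  return annotations.map((a) => {
    if (insertAt <= a.start) {
      // Insertion upstream of the feature: shift it as a whole.
      return { ...a, start: a.start + insertLength, end: a.end + insertLength };
    }
    if (insertAt <= a.end) {
      // Insertion inside the feature: keep the start, extend the end.
      return { ...a, end: a.end + insertLength };
    }
    // Insertion downstream: coordinates are unchanged (still return a copy).
    return { ...a };
  });
}
```

Returning fresh objects in every branch keeps the function free of side effects, so callers can keep the previous annotation list for undo.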

`geometry` — Map Geometry (Headless)

computeCircularGeometry() — position→angle, SVG arc path generation
computeLinearGeometry() — position→pixel mapping, row layout
exportSVGPaths() — publication-quality SVG path data
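
The position-to-angle mapping behind a circular map can be sketched as follows. The assumptions are mine: angles are measured clockwise from 12 o'clock, a feature's inclusive end extends to the far edge of its last base, and the names positionToAngle, pointOnCircle, and arcPath are hypothetical rather than part of computeCircularGeometry().

```typescript
interface Point { x: number; y: number }

// Angle in radians, clockwise from the top of the circle.
function positionToAngle(position: number, sequenceLength: number): number {
  return (position / sequenceLength) * 2 * Math.PI;
}

// SVG's y axis points down, so "up" is a negative y offset.
function pointOnCircle(angle: number, radius: number, cx: number, cy: number): Point {
  return { x: cx + radius * Math.sin(angle), y: cy - radius * Math.cos(angle) };
}

// SVG path data for a feature spanning [start, end] (0-based inclusive).
// A feature with start > end is treated as wrapping past the origin.
function arcPath(
  start: number, end: number, sequenceLength: number,
  radius: number, cx: number, cy: number,
): string {
  const a0 = positionToAngle(start, sequenceLength);
  let a1 = positionToAngle(end + 1, sequenceLength);
  if (a1 <= a0) a1 += 2 * Math.PI;
  const p0 = pointOnCircle(a0, radius, cx, cy);
  const p1 = pointOnCircle(a1, radius, cx, cy);
  const largeArc = a1 - a0 > Math.PI ? 1 : 0;
  // Sweep flag 1 draws the arc clockwise on screen.
  return `M ${p0.x.toFixed(2)} ${p0.y.toFixed(2)} ` +
    `A ${radius} ${radius} 0 ${largeArc} 1 ${p1.x.toFixed(2)} ${p1.y.toFixed(2)}`;
}
```

A feature covering the entire circle starts and ends at the same point, which a single SVG arc cannot express; that case would need two arcs or a circle element and is left out of this sketch.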

Development

# Run tests
bun test           # or: npx vitest run

# Type check
bun run typecheck   # or: npx tsc --noEmit

# Build
bun run build       # or: npx tsc

# Clean build artifacts
bun run clean

Design Decisions

Zero runtime dependencies — all algorithms implemented from scratch for tree-shaking and minimal bundle size
Pure functions — all functions return new objects, never mutate inputs
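0-based inclusive indexing — positions start at 0 and ranges include both endpoints. GenBank feature tables are 1-based inclusive, so a GenBank location of 1..10 corresponds to 0..9 here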
Circular-aware — all range/position operations handle origin-wrapping sequences
Web Worker friendly — no DOM dependencies in computation code, and functions are pure, so they can run unchanged inside a worker
Lazy enzyme database — enzyme definitions loaded on first access, not at import time
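
To show how pure functions, inclusive indexing, and origin wrapping fit together, here is a small sketch of circular range arithmetic. It assumes a convention in which a range with start greater than end wraps past the origin; that convention and the names CircRange, splitAtOriginSketch, circularRangeLength, and circularRangesOverlap are illustrative assumptions, not the signatures of splitAtOrigin or rangesOverlap.

```typescript
interface CircRange {
  start: number; // 0-based, inclusive
  end: number;   // 0-based, inclusive; start > end means the range wraps
}

// Split a wrapping range into two ordinary pieces; never mutates the input.
function splitAtOriginSketch(range: CircRange, sequenceLength: number): CircRange[] {
  if (range.start <= range.end) return [{ start: range.start, end: range.end }];
  return [
    { start: range.start, end: sequenceLength - 1 },
    { start: 0, end: range.end },
  ];
}

// Number of bases covered, counting both endpoints.
function circularRangeLength(range: CircRange, sequenceLength: number): number {
  return range.start <= range.end
    ? range.end - range.start + 1
    : sequenceLength - range.start + range.end + 1;
}

// Two ranges overlap if any of their non-wrapping pieces share a base.
function circularRangesOverlap(a: CircRange, b: CircRange, sequenceLength: number): boolean {
  for (const pa of splitAtOriginSketch(a, sequenceLength)) {
    for (const pb of splitAtOriginSketch(b, sequenceLength)) {
      if (pa.start <= pb.end && pb.start <= pa.end) return true;
    }
  }
  return false;
}
```

On a 10-base circular sequence, the range from 8 to 2 covers positions 8, 9, 0, 1, and 2, so it overlaps 1..4 but not 3..7.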

License

MIT
