# stepincto-smams

v0.1.1
TypeScript types for the SMAMS (Sheetmuse Music Annotation Metadata Standard) - a standardized schema for audio annotation data supporting hierarchical analysis of music, speech, and general audio content.
## Overview

SMAMS provides structured containers for temporal annotations with nested namespaces, enabling complex multi-level analysis such as:

- Speech: Utterance → Word → Phoneme
- Music: Song → Section → Bar → Beat → Note
- Audio Events: Scene → Event → Sub-event

The schema supports rich metadata tracking, pipeline provenance, and temporal validation, making it well suited to research, production audio analysis, and multi-modal AI systems.
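As a sketch of how these levels nest, a word-level observation can carry a phoneme-level namespace in its `annotations` field. The interfaces below are trimmed local copies of this package's types (normally you would import them from `stepincto-smams`), and the phoneme values are illustrative:

```typescript
// Trimmed local copies of the package's types, so this sketch runs standalone
interface TimeInterval { time: number; duration: number; }
interface Namespace { namespace: string; data: Observation[]; }
interface Observation {
  interval: TimeInterval;
  value: string | number | boolean;
  confidence?: number | null;
  annotations?: Namespace[] | null; // nesting happens here
}

// A word observation carrying nested phoneme-level observations
const word: Observation = {
  interval: { time: 10.0, duration: 0.8 },
  value: "Hello",
  annotations: [
    {
      namespace: "phoneme",
      data: [
        { interval: { time: 10.0, duration: 0.3 }, value: "HH" },
        { interval: { time: 10.3, duration: 0.5 }, value: "EH" },
      ],
    },
  ],
};

// Walk one level down: word -> phoneme values
const phonemes = word.annotations?.[0]?.data.map(o => o.value) ?? [];
console.log(phonemes); // [ 'HH', 'EH' ]
```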
## Installation

```shell
npm install stepincto-smams
```

## Quick Start
```typescript
import type { SMAMS, Observation, Namespace } from "stepincto-smams";
import fs from "fs";

// Load SMAMS data
const smamsData = JSON.parse(fs.readFileSync("audio-annotations.smams", "utf8")) as SMAMS;

// Access metadata (audio may be null, so use optional chaining)
console.log(smamsData.metadata.audio?.title);
console.log(smamsData.metadata.file.duration);

// Access annotations
const wordNamespace = smamsData.annotations.find(ns => ns.namespace === "word");
if (wordNamespace?.data && Array.isArray(wordNamespace.data)) {
  // data may also be Record<string, unknown>[]; assert the expected shape here
  (wordNamespace.data as Observation[]).forEach(obs => {
    console.log(`${obs.value} at ${obs.interval.time}s`);
  });
}
```

## Core Type System
### Root Container

#### SMAMS

The main container for all audio annotation data.

```typescript
interface SMAMS {
  schema_version?: string;  // Version of the SMAMS schema
  metadata: SMAMSMetadata;  // Comprehensive metadata
  annotations: Namespace[]; // Top-level namespaces
}
```

### Metadata Types
#### SMAMSMetadata

Comprehensive metadata container aggregating all information about the file and the annotation process.

```typescript
interface SMAMSMetadata {
  file: FileMetadata;              // Technical file information
  source: SourceMetadata | null;   // Where the audio was obtained
  audio: AudioMetadata | null;     // Content metadata
  annotation?: AnnotationMetadata; // Pipeline information
  sandbox?: SandboxMetadata;       // Custom extensions
}
```

#### FileMetadata
Technical metadata about the audio file.

```typescript
interface FileMetadata {
  id: string;                  // Unique file identifier
  path?: string | null;        // File system path
  duration?: number | null;    // Length in seconds
  sample_rate?: number | null; // Sampling rate in Hz (e.g., 44100, 48000)
  extension?: string | null;   // File format (mp3, wav, etc.)
  channels?: number | null;    // Number of audio channels
}
```

#### AudioMetadata
Descriptive information about the audio content.

```typescript
interface AudioMetadata {
  title: string;                // Track title
  artist?: string | null;       // Artist/performer name
  album?: string | null;        // Album/collection name
  genres?: string[] | null;     // Musical/content genres
  release_date?: string | null; // ISO format (YYYY-MM-DD)
}
```

#### SourceMetadata
Information about where and when the audio was obtained.

```typescript
interface SourceMetadata {
  id?: string | null;            // External platform ID
  downloaded_at?: string | null; // Download timestamp
  source_id?: string | null;     // Platform identifier (youtube, spotify, etc.)
}
```

#### AnnotationMetadata
Details about the annotation pipeline and processing environment.

```typescript
interface AnnotationMetadata {
  pipeline_name: string;     // Name of the annotation pipeline
  hostname: string;          // Processing machine name
  created_at: string;        // Creation timestamp
  steps: AnnotationSource[]; // Pipeline steps (min. 1)
}
```

#### AnnotationSource
Information about a model or tool used in the pipeline.

```typescript
interface AnnotationSource {
  model_id: string;             // Model/tool identifier
  git_revision?: string | null; // Git commit for reproducibility
  task?: string | null;         // Analysis task type
}
```

#### SandboxMetadata
Arbitrary fields for future extensions and experimentation.

```typescript
interface SandboxMetadata {
  fields: { [k: string]: unknown }; // Custom extension fields
}
```

### Core Annotation Types
#### Namespace

Container organizing observations of the same type, with optional metadata.

```typescript
interface Namespace {
  namespace: string;    // Type identifier (word, note, etc.)
  data: ObservationList | Observation[] | Record<string, unknown>[];
  metadata?: Record<string, unknown> | ModelMetadata | null;
  type?: string | null; // Optional type hint
}
```

#### Observation
Basic annotation unit containing a temporal location, a value, and an optional confidence.

```typescript
interface Observation {
  interval: TimeInterval;           // Temporal boundaries
  value: string | number | boolean; // Observed content
  confidence?: number | null;       // Model confidence (0.0-1.0)
  annotations?: Namespace[] | null; // Nested annotations
}
```

#### TimeInterval
Represents temporal boundaries as a start time and a duration.

```typescript
interface TimeInterval {
  time: number;     // Start time in seconds
  duration: number; // Duration in seconds
}
```

#### ObservationList
Validated container for time-ordered observations.

```typescript
interface ObservationList {
  observations: Observation[]; // Time-ordered observation list
}
```

## Hierarchical Structure Examples
### Speech Transcription

```typescript
// Phrase level
const phraseObs: Observation = {
  interval: { time: 10.0, duration: 3.5 },
  value: "Hello world how are you",
  annotations: [wordNamespace] // Contains word-level observations
};

// Word level (nested under the phrase)
const wordObs: Observation = {
  interval: { time: 10.0, duration: 0.8 },
  value: "Hello",
  annotations: [phonemeNamespace] // Contains phoneme-level observations
};
```

### Music Analysis
```typescript
// Song section
const sectionObs: Observation = {
  interval: { time: 0.0, duration: 32.0 },
  value: "verse_1",
  annotations: [barNamespace] // Contains bar-level observations
};

// Musical note
const noteObs: Observation = {
  interval: { time: 1.25, duration: 0.5 },
  value: "C4",
  confidence: 0.92
};
```

## Usage Patterns
### Type-Safe Data Access

```typescript
import type { SMAMS, Namespace, Observation } from "stepincto-smams";

function extractWords(smams: SMAMS): string[] {
  const wordNamespace = smams.annotations.find(ns => ns.namespace === "word");
  if (!wordNamespace?.data || !Array.isArray(wordNamespace.data)) {
    return [];
  }
  return wordNamespace.data
    .map(obs => (typeof obs.value === "string" ? obs.value : String(obs.value)))
    .filter(Boolean);
}
```

### Working with Nested Annotations
```typescript
function getPhraseWords(phraseObs: Observation): Observation[] {
  const wordNamespace = phraseObs.annotations?.find(ns => ns.namespace === "word");
  if (!wordNamespace?.data || !Array.isArray(wordNamespace.data)) {
    return [];
  }
  // data may also be Record<string, unknown>[]; assert the expected shape here
  return wordNamespace.data as Observation[];
}
```

### Metadata Validation
```typescript
function validateAudioFile(metadata: SMAMSMetadata): boolean {
  return !!(
    metadata.file.id &&
    metadata.file.duration &&
    metadata.file.duration > 0 &&
    metadata.audio?.title
  );
}
```

## Demo Data
The package includes a comprehensive demo SMAMS file for testing and development:

```typescript
import type { SMAMS } from "stepincto-smams";
import fs from "fs";
import { fileURLToPath } from "url";
import path from "path";

// Load demo data
const demoPath = path.resolve(
  path.dirname(fileURLToPath(import.meta.url)),
  "node_modules/stepincto-smams/demo.smams"
);
const demoData = JSON.parse(fs.readFileSync(demoPath, "utf8")) as SMAMS;

// Explore demo content
console.log("Title:", demoData.metadata.audio?.title);     // "Uptown Girl"
console.log("Artist:", demoData.metadata.audio?.artist);   // "Westlife"
console.log("Duration:", demoData.metadata.file.duration); // 187.21 seconds
console.log("Namespaces:", demoData.annotations.length);   // 4 namespaces
```

### Demo Data Contents
The demo includes:

- Audio metadata: "Uptown Girl" by Westlife from Greatest Hits (2000)
- File metadata: 187 seconds, 48 kHz, stereo MP3
- Multiple namespaces:
  - `phrase_word_namespace`: Phrase-level transcription
  - `phrase_aligned_lyrics`: Phrase-aligned lyrics with timing
  - `word_aligned_lyrics`: Word-level lyrics with precise timing
  - `notes`: Musical note detection data
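For exploring a file like the demo, a small helper (hypothetical, not part of the package) can summarize how many observations each namespace holds. `NamespaceLike` below mirrors the package's `Namespace`, with `data` as either a plain array or an `ObservationList`-style wrapper:

```typescript
// Hypothetical helper; NamespaceLike mirrors the package's Namespace shape
interface NamespaceLike {
  namespace: string;
  data: unknown[] | { observations: unknown[] };
}

function summarizeNamespaces(annotations: NamespaceLike[]): Record<string, number> {
  const summary: Record<string, number> = {};
  for (const ns of annotations) {
    // data may be a bare array or an ObservationList-style wrapper
    const items = Array.isArray(ns.data) ? ns.data : ns.data.observations;
    summary[ns.namespace] = items.length;
  }
  return summary;
}

console.log(summarizeNamespaces([
  { namespace: "word_aligned_lyrics", data: [{}, {}, {}] },
  { namespace: "notes", data: { observations: [{}] } },
]));
// { word_aligned_lyrics: 3, notes: 1 }
```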
## Integration with Build Tools

### TypeScript Configuration

Ensure your `tsconfig.json` includes:

```json
{
  "compilerOptions": {
    "strict": true,
    "moduleResolution": "node",
    "esModuleInterop": true
  }
}
```

### Runtime Validation
For runtime type checking, consider using libraries like zod or io-ts:

```typescript
import { z } from "zod";

const TimeIntervalSchema = z.object({
  time: z.number().min(0),
  duration: z.number().positive()
});

const ObservationSchema = z.object({
  interval: TimeIntervalSchema,
  value: z.union([z.string(), z.number(), z.boolean()]),
  confidence: z.number().min(0).max(1).optional()
});
```

## Common Use Cases
### 1. Speech Analysis

- Transcription with word-level timing
- Speaker diarization with confidence scores
- Phoneme-level analysis for pronunciation

### 2. Music Information Retrieval

- Note transcription with pitch and timing
- Chord progression analysis
- Beat and tempo tracking

### 3. Audio Event Detection

- Environmental sound classification
- Audio scene analysis
- Multi-label audio tagging

### 4. Multi-Modal Analysis

- Synchronized audio-visual analysis
- Cross-modal alignment
- Temporal correspondence mapping
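As one illustration of the speech-analysis case, diarization segments fit naturally as observations whose values are speaker labels. The shape below mirrors the package's `Observation`; all names and numbers are made up:

```typescript
// Diarization-style segments; shape mirrors Observation (illustrative data)
interface SpeakerSegment {
  interval: { time: number; duration: number };
  value: string;              // speaker label
  confidence?: number | null; // model confidence
}

const segments: SpeakerSegment[] = [
  { interval: { time: 0.0, duration: 4.2 }, value: "speaker_A", confidence: 0.97 },
  { interval: { time: 4.2, duration: 2.1 }, value: "speaker_B", confidence: 0.61 },
  { interval: { time: 6.3, duration: 3.0 }, value: "speaker_A", confidence: 0.88 },
];

// Keep only segments the model is confident about
const confident = segments.filter(s => (s.confidence ?? 0) >= 0.8);
console.log(confident.map(s => s.value)); // [ 'speaker_A', 'speaker_A' ]
```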
## Error Handling

```typescript
import fs from "fs";
import type { SMAMS } from "stepincto-smams";

function safeLoadSMAMS(filePath: string): SMAMS | null {
  try {
    const data = JSON.parse(fs.readFileSync(filePath, "utf8"));
    // Basic structure validation
    if (!data.metadata || !data.annotations || !Array.isArray(data.annotations)) {
      console.error("Invalid SMAMS structure");
      return null;
    }
    return data as SMAMS;
  } catch (error) {
    console.error("Failed to load SMAMS file:", error);
    return null;
  }
}
```

## Related Projects
- SMAMS Python Library: the main SMAMS implementation
- JAMS: JSON Annotated Music Specification (an inspiration for SMAMS)

## License

ISC

## Contributing

For issues, feature requests, or contributions to the SMAMS standard itself, please visit the main repository.
