SignalK Parquet Data Store
A comprehensive SignalK plugin and webapp that saves SignalK data directly to Parquet files with manual and automated regimen-based archiving and advanced querying features, including a REST API built on the SignalK History API, Claude AI history data analysis, and spatial geographic analysis capabilities.
Features
Core Data Management
- Smart Data Types: Intelligent Parquet schema detection preserves native data types (DOUBLE, BOOLEAN) instead of forcing everything to strings
- Multiple File Formats: Support for Parquet, JSON, and CSV output formats (querying in parquet only)
- Daily Consolidation: Automatic daily file consolidation with S3 upload capabilities
- Near Real-time Buffering: Efficient data buffering with configurable thresholds
Data Validation & Schema Repair
- NEW Schema Validation: Comprehensive validation of Parquet file schemas against SignalK metadata standards
- NEW Automated Repair: One-click repair of schema violations with proper data type conversion
- NEW Type Correction: Automatic conversion of incorrectly stored data types (e.g., numeric strings → DOUBLE, boolean strings → BOOLEAN)
- NEW Metadata Integration: Uses SignalK metadata (units, types) to determine correct data types for marine measurements
- NEW Safe Operations: Creates backups before repair and quarantines corrupted files for safety
- NEW Progress Tracking: Real-time progress monitoring with cancellation support for large datasets
Benefits of Proper Data Types
Using correct data types in Parquet files provides significant advantages:
- Storage Efficiency: Numeric data stored as DOUBLE uses ~50% less space than string representations
- Query Performance: Native numeric operations are 5-10x faster than string parsing during analysis
- Data Integrity: Type validation prevents data corruption and ensures consistent analysis results
- Analytics Compatibility: Proper types enable advanced statistical analysis and machine learning applications
- Compression: Parquet's columnar compression works optimally with correctly typed data
Validation Process
The validation system checks each Parquet file for:
- Field Type Consistency: Ensures numeric marine data (position, speed, depth) is stored as DOUBLE
- Boolean Representation: Validates true/false values are stored as BOOLEAN, not strings
- Metadata Alignment: Compares file schemas against SignalK metadata for units like meters, volts, amperes
- Schema Standards: Enforces data best practices for long-term data integrity
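As a rough picture of the metadata-driven check, the sketch below maps a SignalK unit to an expected Parquet type and flags mismatches. The unit list, names, and logic are simplified assumptions for illustration, not the plugin's actual validation code.

```typescript
// Hypothetical sketch of unit-driven type checking (names and rules are assumptions).
type ParquetType = 'DOUBLE' | 'BOOLEAN' | 'UTF8';

// SignalK metadata units that imply numeric DOUBLE storage (illustrative subset).
const NUMERIC_UNITS = new Set(['m', 'm/s', 'rad', 'K', 'V', 'A', 'Pa', 'ratio']);

function expectedType(units?: string): ParquetType {
  return units && NUMERIC_UNITS.has(units) ? 'DOUBLE' : 'UTF8';
}

// A violation is a column whose stored type differs from what the metadata implies.
function isViolation(storedType: ParquetType, units?: string): boolean {
  return expectedType(units) === 'DOUBLE' && storedType !== 'DOUBLE';
}

// Example: depth annotated with "m" but stored as strings would be flagged for repair.
console.log(isViolation('UTF8', 'm')); // true → convert column to DOUBLE
```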
Advanced Querying
- SignalK History API Compliance: Full compliance with SignalK History API specifications
- Standard Time Parameters: All 5 standard query patterns supported
- Time-Filtered Discovery: Paths and contexts filtered by time range
- Optional Analytics: Moving averages (EMA/SMA) available on demand
- 🔄 NEW: Automatic Unit Conversion: Optional integration with the signalk-units-preference plugin
  - Server-side conversion to user's preferred units (knots, km/h, °F, °C, etc.)
  - Add ?convertUnits=true to any history query
  - Respects all unit preferences configured in the units-preference plugin
  - Configurable cache (1-60 minutes) balances performance vs. responsiveness
  - Conversion metadata included in response
- 🌍 NEW: Timezone Conversion: Convert UTC timestamps to local or specified timezone
  - Add ?convertTimesToLocal=true to convert timestamps to local time
  - Optional &timezone=America/New_York for custom IANA timezone
  - Automatic daylight saving time handling
  - Clean ISO 8601 format with offset (e.g., 2025-10-20T12:34:04-04:00)
- Flexible Time Querying: Multiple ways to specify time ranges
- Query from now, from specific times, or between time ranges
- Duration-based windows (1h, 30m, 2d) for easy relative queries
- Forward and backward time querying support
- Time Alignment: Automatic alignment of data from different sensors using time bucketing
- DuckDB Integration: Direct SQL querying of Parquet files with type-safe operations
- 🌍 Spatial Analysis: Advanced geographic analysis with DuckDB spatial extension
- Track Analysis: Calculate vessel tracks, distances, and movement patterns
- Proximity Detection: Multi-vessel distance calculations and collision risk analysis
- Geographic Visualization: Generate movement boundaries, centroids, and spatial statistics
- Route Planning: Historical track analysis for route optimization and performance analysis
Management & Control
- Command Management: Register, execute, and manage SignalK commands with automatic path configuration
- Regimen-Based Data Collection: Control data collection with command-based regimens
- Multi-Vessel Support: Wildcard vessel contexts (vessels.*) with MMSI-based exclusion filtering
- Source Filtering: Filter data by SignalK source labels (bypasses server arbitration for raw data access)
- Comprehensive REST API: Full programmatic control of queries and configuration
User Interface & Integration
- Responsive Web Interface: Complete web-based management interface
- S3 Integration: Upload files to Amazon S3 with configurable timing and conflict resolution
- Context Support: Support for multiple vessel contexts with exclusion controls
Regimen System (Advanced)
- Operational Context Tracking: Define regimens for operational states (mooring, anchoring, racing, passage-making)
- Command-Based Episodes: Track state transitions using SignalK commands as regimen triggers
- Keyword Mapping: Associate keywords with commands for intelligent Claude AI context matching
- Episode Boundary Detection: Sophisticated SQL-based detection of operational periods using CTEs and window functions
- Contextual Data Collection: Link SignalK paths to regimens for targeted data analysis during specific operations
- Web Interface Management: Create, edit, and manage regimens and command keywords through the web UI
NEW Threshold Automation
- NEW Per-Command Conditions: Each regimen/command can define one or more thresholds that watch a single SignalK path.
- NEW True-Only Actions: On every path update the condition is evaluated; when it is true the command is set to the threshold's activateOnMatch state (ON/OFF). False evaluations leave the command untouched, so use a second threshold if you want a different level to switch it back.
- NEW Stable Triggers: Optional hysteresis (seconds) suppresses re-firing while the condition remains true, preventing rapid toggling in noisy data.
- NEW Multiple Thresholds Per Path: Unique monitor keys allow several thresholds to observe the same SignalK path without cancelling each other.
- NEW Unit Handling: Threshold values must match the live SignalK units (e.g., fractional 0–1 SoC values). Angular thresholds are entered in degrees in the UI and stored as radians automatically.
- NEW Automation State Machine: When enabling automation, command is set to OFF then all thresholds are immediately evaluated. When disabling automation, threshold monitoring stops and command state remains unchanged. Default state is hardcoded to OFF on server side.
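To make the configuration concrete, here is one possible shape for a pair of thresholds watching the same path. The property names are assumptions for illustration, not the plugin's actual schema.

```typescript
// Hypothetical threshold shape (property names are assumptions, not the real schema).
// Values are compared in live SignalK units, e.g. a 0–1 state-of-charge ratio.
interface Threshold {
  watchPath: string;              // the single SignalK path this condition watches
  operator: '>' | '<' | '==';
  value: number;
  activateOnMatch: 'ON' | 'OFF';  // state applied only when the condition is true
  hysteresisSeconds?: number;     // suppress re-firing while the condition stays true
}

// False evaluations never switch the command back, so a second threshold
// provides the opposite transition at a different level.
const chargeRegimen: Threshold[] = [
  { watchPath: 'electrical.batteries.512.capacity.stateOfCharge',
    operator: '<', value: 0.5, activateOnMatch: 'ON', hysteresisSeconds: 60 },
  { watchPath: 'electrical.batteries.512.capacity.stateOfCharge',
    operator: '>', value: 0.8, activateOnMatch: 'OFF', hysteresisSeconds: 60 },
];
console.log(chargeRegimen.length); // two monitors with unique keys on the same path
```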
Claude AI Integration
- AI-Powered Analysis: Advanced maritime data analysis using Claude AI models (Opus 4, Sonnet 4)
- Regimen-Based Analysis: Context-aware episode detection for operational states (mooring, anchoring, sailing)
- Command Integration: Keyword-based regimen matching with customizable command configurations
- Episode Detection: Sophisticated boundary detection for operational transitions
- Multi-Vessel Support: Real-time data access from self vessel and other vessels via SignalK
- Conversation Continuity: Follow-up questions with preserved context and specialized tools
- Timezone Intelligence: Automatic UTC-to-local time conversion based on system timezone
- Custom Analysis: Create custom analysis prompts for specific operational needs
Requirements
Core Requirements
- SignalK Server v1.x or v2.x
- Node.js 18+ (included with SignalK)
Optional Plugin Integration
- signalk-units-preference (v0.7.0+): Required for automatic unit conversion feature
- Install from: https://github.com/motamman/signalk-units-preference
- Provides server-side unit conversion based on user preferences
- The history API will work without this plugin, but convertUnits=true will have no effect
Installation
Install from GitHub
# Navigate to folder
cd ~/.signalk/node_modules/
# Install from npm
npm install signalk-parquet
# Or install from GitHub
npm install motamman/signalk-parquet
cd ~/.signalk/node_modules/signalk-parquet
npm run build
# Restart SignalK
sudo systemctl restart signalk
⚠️ IMPORTANT IF UPGRADING FROM 0.5.0-beta.3: Consolidation Bug Fix
THIS RELEASE FIXES A RECURSIVE BUG THAT WAS CREATING NESTED processed DIRECTORIES AND REPEATEDLY PROCESSING THE SAME FILES. ANY processed FOLDERS NESTED INSIDE ANOTHER processed FOLDER MUST STILL BE DELETED MANUALLY.
Cleaning Up Nested Processed Directories
No action is likely needed if upgrading from 0.5.0-beta.4 or later. If you're upgrading from an earlier version, you may have nested processed directories that need cleanup:
# Check for nested processed directories
find data -name "*processed*" -type d | head -20
# See the deepest nesting levels
find data -name "*processed*" -type d | awk -F'/' '{print NF-1, $0}' | sort -nr | head -5
# Count files in nested processed directories
find data -path "*/processed/processed/*" -type f | wc -l
# Remove ALL nested processed directories (RECOMMENDED)
find data -name "processed" -type d -exec rm -rf {} +
# Verify cleanup completed
find data -path "*/processed/processed/*" -type f | wc -l # Should show 0
Note: The processed directories only contain files that were moved during consolidation - removing them does not delete your original data.
Development Setup
# Clone or copy the signalk-parquet directory
cd signalk-parquet
# Install dependencies
npm install
# Build the TypeScript code
npm run build
# Copy to SignalK plugins directory
cp -r . ~/.signalk/node_modules/signalk-parquet/
# Restart SignalK
sudo systemctl restart signalk
Production Build
# Build for production
npm run build
# The compiled JavaScript will be in the dist/ directory
Configuration
Plugin Configuration
Navigate to SignalK Admin → Server → Plugin Config → SignalK Parquet Data Store
Configure basic plugin settings (path configuration is managed separately in the web interface):
| Setting | Description | Default |
|---------|-------------|---------|
| Buffer Size | Number of records to buffer before writing | 1000 |
| Save Interval | How often to save buffered data (seconds) | 30 |
| Output Directory | Directory to save data files | SignalK data directory |
| Filename Prefix | Prefix for generated filenames | signalk_data |
| File Format | Output format (parquet, json, csv) | parquet |
| Retention Days | Days to keep processed files | 7 |
| Unit Conversion Cache Duration 🆕 | How long to cache unit conversions before reloading (minutes) | 5 |
Note: The Unit Conversion Cache Duration setting controls how quickly changes to unit preferences (in the signalk-units-preference plugin) are reflected in the history API. Lower values (1-2 minutes) reflect changes faster but use more resources. Higher values (30-60 minutes) reduce overhead but take longer to reflect changes. The default of 5 minutes provides a good balance for most users.
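The Buffer Size and Save Interval settings act together as a size-or-interval flush: a batch is written as soon as either limit is reached. The sketch below only illustrates that interaction; the names and structure are assumptions, not the plugin's actual code.

```typescript
// Illustrative size-or-interval flushing (not the plugin's actual implementation).
interface BufferedRecord { path: string; value: unknown; timestamp: string }

class FlushBuffer {
  private records: BufferedRecord[] = [];

  constructor(
    private bufferSize: number,                       // e.g. 1000 records (default)
    saveIntervalSeconds: number,                      // e.g. 30 seconds (default)
    private flush: (batch: BufferedRecord[]) => void  // e.g. write one Parquet file
  ) {
    setInterval(() => this.drain(), saveIntervalSeconds * 1000);
  }

  add(record: BufferedRecord): void {
    this.records.push(record);
    if (this.records.length >= this.bufferSize) this.drain(); // size threshold hit
  }

  private drain(): void {
    if (this.records.length === 0) return;
    const batch = this.records;
    this.records = [];
    this.flush(batch);
  }
}
```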
S3 Upload Configuration
Configure S3 upload settings in the plugin configuration:
| Setting | Description | Default |
|---------|-------------|---------|
| Enable S3 Upload | Enable uploading to Amazon S3 | false |
| Upload Timing | When to upload (realtime/consolidation) | consolidation |
| S3 Bucket | Name of S3 bucket | - |
| AWS Region | AWS region for S3 bucket | us-east-1 |
| Key Prefix | S3 object key prefix | - |
| Access Key ID | AWS credentials (optional) | - |
| Secret Access Key | AWS credentials (optional) | - |
| Delete After Upload | Delete local files after upload | false |
Claude AI Configuration
Configure Claude AI integration in the plugin configuration for advanced data analysis:
| Setting | Description | Default |
|---------|-------------|---------|
| Enable Claude Integration | Enable AI-powered data analysis | false |
| API Key | Anthropic Claude API key (required) | - |
| Model | Claude model to use for analysis | claude-3-7-sonnet-20250219 |
| Max Tokens | Maximum tokens for AI responses | 4000 |
| Temperature | AI creativity level (0-1) | 0.3 |
Supported Claude Models
| Model | Description | Use Case |
|-------|-------------|----------|
| claude-opus-4-1-20250805 | Latest Opus model - highest intelligence | Complex analysis, detailed insights |
| claude-opus-4-20250514 | Opus model - very high intelligence | Advanced analysis |
| claude-sonnet-4-20250514 | Sonnet model - balanced performance | Recommended default |
Getting a Claude API Key
- Visit Anthropic Console
- Create an account or sign in
- Navigate to API Keys section
- Generate a new API key
- Copy the key and paste it in the plugin configuration
Note: Claude AI analysis requires an active Anthropic API subscription. Usage is billed based on tokens consumed during analysis.
Path Configuration
Important: Path configuration is managed exclusively through the web interface, not in the SignalK admin interface. This provides a more intuitive interface for managing data collection paths.
Accessing Path Configuration
- Navigate to: http://localhost:3000/plugins/signalk-parquet
- Click the ⚙️ Path Configuration tab
Adding Data Paths
Use the web interface to configure which SignalK paths to collect:
- Click ➕ Add New Path
- Configure the path settings:
  - SignalK Path: The SignalK data path (e.g., navigation.position)
  - Always Enabled: Collect data regardless of regimen state
  - Regimen Control: Command name that controls collection
  - Source Filter: Only collect from specific sources
  - Context: SignalK context (vessels.self, vessels.*, or a specific vessel)
  - Exclude MMSI: For the vessels.* context, exclude specific MMSI numbers
- Click ✅ Add Path
Managing Existing Paths
- Edit Path: Click ✏️ Edit button to modify path settings
- Delete Path: Click 🗑️ Remove button to delete a path
- Refresh: Click 🔄 Refresh Paths to reload configuration
- Show/Hide Commands: Toggle button to show/hide command paths in the table
Command Management
The plugin streamlines command management with automatic path configuration:
- Register Command: Commands are automatically registered with enabled path configurations
- Start Command: Click Start button to activate a command regimen
- Stop Command: Click Stop button to deactivate a command regimen
- Remove Command: Click Remove button to delete a command and its path configuration
This eliminates the previous 3-step process of registering commands, adding paths, and enabling them separately.
Path Configuration Storage
Path configurations are stored separately from plugin configuration in:
~/.signalk/signalk-parquet/webapp-config.json
This allows for:
- Independent management of path configurations
- Better separation of concerns
- Easier backup and migration of path settings
- More intuitive web-based configuration interface
Regimen-Based Control
Regimens allow you to control data collection based on SignalK commands:
Example: Weather data collection with source filtering
{
"path": "environment.wind.angleApparent",
"enabled": false,
"regimen": "captureWeather",
"source": "mqtt-weatherflow-udp",
"context": "vessels.self"
}
Note: Source filtering accesses raw data before SignalK server arbitration, allowing collection of data from specific sources that might otherwise be filtered out.
Multi-Vessel Example: Collect navigation data from all vessels except specific MMSI numbers
{
"path": "navigation.position",
"enabled": true,
"context": "vessels.*",
"excludeMMSI": ["123456789", "987654321"]
}
Command Path: Command paths are automatically created when registering commands
{
"path": "commands.captureWeather",
"enabled": true,
"context": "vessels.self"
}
This path will only collect data when the command commands.captureWeather is active.
TypeScript Architecture
Type Safety
The plugin uses comprehensive TypeScript interfaces:
interface PluginConfig {
bufferSize: number;
saveIntervalSeconds: number;
outputDirectory: string;
filenamePrefix: string;
fileFormat: 'json' | 'csv' | 'parquet';
paths: PathConfig[];
s3Upload: S3UploadConfig;
}
interface PathConfig {
path: string;
enabled: boolean;
regimen?: string;
source?: string;
context: string;
excludeMMSI?: string[];
}
interface DataRecord {
received_timestamp: string;
signalk_timestamp: string;
context: string;
path: string;
value: any;
source_label?: string;
meta?: string;
}
Plugin State Management
The plugin maintains typed state:
interface PluginState {
unsubscribes: Array<() => void>;
dataBuffers: Map<string, DataRecord[]>;
activeRegimens: Set<string>;
subscribedPaths: Set<string>;
parquetWriter?: ParquetWriter;
s3Client?: any;
currentConfig?: PluginConfig;
}
Express Router Types
API routes are fully typed:
router.get('/api/paths',
(_: TypedRequest, res: TypedResponse<PathsApiResponse>) => {
// Typed request/response handling
}
);
Data Output Structure
File Organization
output_directory/
├── vessels/
│ └── self/
│ ├── navigation/
│ │ ├── position/
│ │ │ ├── signalk_data_20250716T120000.parquet
│ │ │ └── signalk_data_20250716_consolidated.parquet
│ │ └── speedOverGround/
│ └── environment/
│ └── wind/
│ └── angleApparent/
└── processed/
    └── [moved files after consolidation]
Data Schema
Each record contains:
| Field | Type | Description |
|-------|------|-------------|
| received_timestamp | string | When the plugin received the data |
| signalk_timestamp | string | Original SignalK timestamp |
| context | string | SignalK context (e.g., vessels.self) |
| path | string | SignalK path |
| value | DOUBLE/BOOLEAN/INT64/UTF8 | Smart typed values - numbers stored as DOUBLE, booleans as BOOLEAN, etc. |
| value_json | string | JSON representation for complex values |
| source | string | Complete source information |
| source_label | string | Source label |
| source_type | string | Source type |
| source_pgn | number | PGN number (if applicable) |
| meta | string | Metadata information |
Smart Data Types
The plugin now intelligently detects and preserves native data types:
- Numbers: Stored as DOUBLE (floating point) or INT64 (integers)
- Booleans: Stored as BOOLEAN
- Strings: Stored as UTF8
- Objects: Serialized to JSON and stored as UTF8
- Mixed Types: Falls back to UTF8 when a path contains multiple data types
This provides better compression, faster queries, and proper type safety for data analysis.
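A minimal sketch of this style of detection is shown below; it is illustrative only and does not reproduce the plugin's exact logic.

```typescript
// Illustrative value-to-Parquet-type detection (not the plugin's exact rules).
type DetectedType = 'DOUBLE' | 'INT64' | 'BOOLEAN' | 'UTF8';

function detectType(value: unknown): DetectedType {
  if (typeof value === 'boolean') return 'BOOLEAN';
  if (typeof value === 'number') return Number.isInteger(value) ? 'INT64' : 'DOUBLE';
  return 'UTF8'; // strings, and objects serialized to JSON
}

// A column whose values do not agree on a single type falls back to UTF8.
function columnType(values: unknown[]): DetectedType {
  const types = new Set(values.map(detectType));
  return types.size === 1 ? Array.from(types)[0] : 'UTF8';
}

console.log(columnType([3.2, 4.1, 2.8]));      // DOUBLE
console.log(columnType([true, false]));        // BOOLEAN
console.log(columnType([3.2, 'calibrating'])); // UTF8 (mixed types)
```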
Web Interface
Features
- Path Configuration: Manage data collection paths with multi-vessel support
- Command Management: Streamlined command registration and control
- Data Exploration: Browse available data paths
- SQL Queries: Execute DuckDB queries against Parquet files
- History API: Query historical data using SignalK History API endpoints
- S3 Status: Test S3 connectivity and configuration
- Responsive Design: Works on desktop and mobile
- MMSI Filtering: Exclude specific vessels from wildcard contexts
API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| /api/paths | GET | List available data paths |
| /api/files/:path | GET | List files for a path |
| /api/sample/:path | GET | Sample data from a path |
| /api/query | POST | Execute SQL query |
| /api/config/paths | GET/POST/PUT/DELETE | Manage path configurations |
| /api/test-s3 | POST | Test S3 connection |
| /api/health | GET | Health check |
| Claude AI Analysis API | | |
| /api/analyze | POST | Perform AI analysis on data |
| /api/analyze/templates | GET | Get available analysis templates |
| /api/analyze/followup | POST | Follow-up analysis questions |
| /api/analyze/history | GET | Get analysis history |
| /api/analyze/test-connection | POST | Test Claude API connection |
| SignalK History API | | |
| /signalk/v1/history/values | GET | SignalK History API - Get historical values |
| /signalk/v1/history/contexts | GET | SignalK History API - Get available contexts |
| /signalk/v1/history/paths | GET | SignalK History API - Get available paths |
DuckDB Integration
Query Examples
Basic Queries
-- Get latest 10 records from navigation position
SELECT * FROM read_parquet('/path/to/navigation/position/*.parquet', union_by_name=true)
ORDER BY received_timestamp DESC LIMIT 10;
-- Count total records
SELECT COUNT(*) FROM read_parquet('/path/to/navigation/position/*.parquet', union_by_name=true);
-- Filter by source
SELECT * FROM read_parquet('/path/to/environment/wind/*.parquet', union_by_name=true)
WHERE source_label = 'mqtt-weatherflow-udp'
ORDER BY received_timestamp DESC LIMIT 100;
-- Aggregate by hour
SELECT
DATE_TRUNC('hour', received_timestamp::timestamp) as hour,
AVG(value::double) as avg_value,
COUNT(*) as record_count
FROM read_parquet('/path/to/data/*.parquet', union_by_name=true)
GROUP BY hour
ORDER BY hour;
🌍 Spatial Analysis Queries
-- Calculate distance traveled over time
WITH ordered_positions AS (
SELECT
signalk_timestamp,
ST_Point(value_longitude, value_latitude) as position,
LAG(ST_Point(value_longitude, value_latitude)) OVER (ORDER BY signalk_timestamp) as prev_position
FROM read_parquet('data/vessels/urn_mrn_imo_mmsi_368396230/navigation/position/*.parquet', union_by_name=true)
WHERE signalk_timestamp >= '2025-09-27T16:00:00Z'
AND signalk_timestamp <= '2025-09-27T23:59:59Z'
AND value_latitude IS NOT NULL AND value_longitude IS NOT NULL
),
distances AS (
SELECT *,
CASE
WHEN prev_position IS NOT NULL
THEN ST_Distance_Sphere(position, prev_position)
ELSE 0
END as distance_meters
FROM ordered_positions
)
SELECT
strftime(date_trunc('hour', signalk_timestamp::TIMESTAMP), '%Y-%m-%dT%H:%M:%SZ') as time_bucket,
AVG(value_latitude) as avg_lat,
AVG(value_longitude) as avg_lon,
ST_AsText(ST_Centroid(ST_Collect(position))) as centroid,
SUM(distance_meters) as total_distance_meters,
COUNT(*) as position_records,
ST_AsText(ST_ConvexHull(ST_Collect(position))) as movement_area
FROM distances
GROUP BY time_bucket
ORDER BY time_bucket;
-- Multi-vessel proximity analysis
SELECT
v1.context as vessel1,
v2.context as vessel2,
ST_Distance_Sphere(
ST_Point(v1.value_longitude, v1.value_latitude),
ST_Point(v2.value_longitude, v2.value_latitude)
) as distance_meters,
v1.signalk_timestamp
FROM read_parquet('data/vessels/*/navigation/position/*.parquet', union_by_name=true) v1
JOIN read_parquet('data/vessels/*/navigation/position/*.parquet', union_by_name=true) v2
ON v1.signalk_timestamp = v2.signalk_timestamp AND v1.context != v2.context
WHERE v1.signalk_timestamp >= '2025-09-27T00:00:00Z'
AND ST_Distance_Sphere(
ST_Point(v1.value_longitude, v1.value_latitude),
ST_Point(v2.value_longitude, v2.value_latitude)
) < 1000 -- Within 1km
ORDER BY distance_meters;
-- Advanced movement analysis with bounding boxes
WITH ordered_positions AS (
SELECT
signalk_timestamp,
ST_Point(value_longitude, value_latitude) as position,
value_latitude,
value_longitude,
LAG(ST_Point(value_longitude, value_latitude)) OVER (ORDER BY signalk_timestamp) as prev_position,
strftime(date_trunc('hour', signalk_timestamp::TIMESTAMP), '%Y-%m-%dT%H:%M:%SZ') as time_bucket
FROM read_parquet('data/vessels/urn_mrn_imo_mmsi_368396230/navigation/position/*.parquet', union_by_name=true)
WHERE signalk_timestamp >= '2025-09-27T16:00:00Z'
AND signalk_timestamp <= '2025-09-27T23:59:59Z'
AND value_latitude IS NOT NULL AND value_longitude IS NOT NULL
),
distances AS (
SELECT *,
CASE
WHEN prev_position IS NOT NULL
THEN ST_Distance_Sphere(position, prev_position)
ELSE 0
END as distance_meters
FROM ordered_positions
)
SELECT
time_bucket,
AVG(value_latitude) as avg_lat,
AVG(value_longitude) as avg_lon,
-- Calculate bounding box manually
MIN(value_latitude) as min_lat,
MAX(value_latitude) as max_lat,
MIN(value_longitude) as min_lon,
MAX(value_longitude) as max_lon,
-- Distance and movement metrics
SUM(distance_meters) as total_distance_meters,
ROUND(SUM(distance_meters) / 1000.0, 2) as total_distance_km,
COUNT(*) as position_records,
-- Movement area approximation using bounding box
(MAX(value_latitude) - MIN(value_latitude)) * 111320 *
(MAX(value_longitude) - MIN(value_longitude)) * 111320 *
COS(RADIANS(AVG(value_latitude))) as approx_area_m2
FROM distances
GROUP BY time_bucket
ORDER BY time_bucket;
Available Spatial Functions
- ST_Point(longitude, latitude) - Create point geometries
- ST_Distance_Sphere(point1, point2) - Calculate distances in meters
- ST_AsText(geometry) - Convert to Well-Known Text format
- ST_Centroid(ST_Collect(points)) - Find center of multiple points
- ST_ConvexHull(ST_Collect(points)) - Create movement boundary polygons
History API Integration
The plugin provides full SignalK History API compliance, allowing you to query historical data using standard SignalK API endpoints with enhanced performance and filtering capabilities.
Available Endpoints
| Endpoint | Description | Parameters |
|----------|-------------|------------|
| /signalk/v1/history/values | Get historical values for specified paths | Standard patterns (see below); optional: resolution, refresh, includeMovingAverages, useUTC |
| /signalk/v1/history/contexts | Get available vessel contexts for time range | Time range: any standard pattern (see below); returns only contexts with data in the specified range |
| /signalk/v1/history/paths | Get available SignalK paths for time range | Time range: any standard pattern (see below); returns only paths with data in the specified range |
Standard Time Range Patterns
The History API supports 5 standard SignalK time query patterns:
| Pattern | Parameters | Description | Example |
|---------|-----------|-------------|---------|
| 1 | duration | Query back from now | ?duration=1h |
| 2 | from + duration | Query forward from start | ?from=2025-01-01T00:00:00Z&duration=1h |
| 3 | to + duration | Query backward to end | ?to=2025-01-01T12:00:00Z&duration=1h |
| 4 | from | From start to now | ?from=2025-01-01T00:00:00Z |
| 5 | from + to | Specific range | ?from=2025-01-01T00:00:00Z&to=2025-01-02T00:00:00Z |
Legacy Support: The start parameter (used with duration) is deprecated but still supported for backward compatibility. A console warning will be shown. Use standard patterns instead.
Query Parameters
| Parameter | Description | Format | Examples |
|-----------|-------------|---------|----------|
| Required for /values: | | | |
| paths | SignalK paths with optional aggregation | path:method,path:method | navigation.position:first,wind.speed:average |
| Time Range: | Use one of the 5 standard patterns above | | |
| duration | Time period | [number][unit] | 1h, 30m, 15s, 2d |
| from | Start time (ISO 8601) | ISO datetime | 2025-01-01T00:00:00Z |
| to | End time (ISO 8601) | ISO datetime | 2025-01-01T06:00:00Z |
| Optional: | | | |
| context | Vessel context | vessels.self or vessels.<id> | vessels.self (default) |
| resolution | Time bucket size in milliseconds | Number | 60000 (1 minute buckets) |
| refresh | Enable auto-refresh (pattern 1 only) | true or 1 | refresh=true |
| includeMovingAverages | Include EMA/SMA calculations | true or 1 | includeMovingAverages=true |
| useUTC | Treat datetime inputs as UTC | true or 1 | useUTC=true |
| convertUnits | 🆕 Convert to preferred units (requires signalk-units-preference plugin) | true or 1 | convertUnits=true |
| convertTimesToLocal | 🆕 Convert timestamps to local/specified timezone | true or 1 | convertTimesToLocal=true |
| timezone | 🆕 IANA timezone ID (used with convertTimesToLocal) | IANA timezone | timezone=America/New_York |
| Deprecated: | | | |
| start | ⚠️ Use standard patterns instead | now or ISO datetime | Deprecated, use duration or from/to |
Query Examples
Pattern 1: Duration Only (Query back from now)
# Last hour of wind data
curl "http://localhost:3000/signalk/v1/history/values?duration=1h&paths=environment.wind.speedApparent"
# Last 30 minutes with moving averages
curl "http://localhost:3000/signalk/v1/history/values?duration=30m&paths=environment.wind.speedApparent&includeMovingAverages=true"
# Real-time with auto-refresh
curl "http://localhost:3000/signalk/v1/history/values?duration=15m&paths=navigation.position&refresh=true"
Pattern 2: From + Duration (Query forward)
# 6 hours forward from specific time
curl "http://localhost:3000/signalk/v1/history/values?from=2025-01-01T00:00:00Z&duration=6h&paths=navigation.position"
Pattern 3: To + Duration (Query backward)
# 2 hours backward to specific time
curl "http://localhost:3000/signalk/v1/history/values?to=2025-01-01T12:00:00Z&duration=2h&paths=environment.wind.speedApparent"
Pattern 4: From Only (From start to now)
# From specific time until now
curl "http://localhost:3000/signalk/v1/history/values?from=2025-01-01T00:00:00Z&paths=navigation.speedOverGround"
Pattern 5: From + To (Specific range)
# Specific 24-hour period
curl "http://localhost:3000/signalk/v1/history/values?from=2025-01-01T00:00:00Z&to=2025-01-02T00:00:00Z&paths=navigation.position"
Advanced Query Examples
Multiple paths with time alignment:
curl "http://localhost:3000/signalk/v1/history/values?duration=6h&paths=environment.wind.angleApparent,environment.wind.speedApparent,navigation.position&resolution=60000"
Multiple aggregations of same path:
curl "http://localhost:3000/signalk/v1/history/values?from=2025-01-01T00:00:00Z&to=2025-01-01T06:00:00Z&paths=environment.wind.speedApparent:average,environment.wind.speedApparent:min,environment.wind.speedApparent:max&resolution=60000"
With moving averages for trend analysis:
curl "http://localhost:3000/signalk/v1/history/values?duration=24h&paths=electrical.batteries.512.voltage&includeMovingAverages=true&resolution=300000"
Different temporal samples:
curl "http://localhost:3000/signalk/v1/history/values?duration=1h&paths=navigation.position:first,navigation.position:middle_index,navigation.position:last&resolution=60000"
Context and Path Discovery
Get contexts with data in last hour:
curl "http://localhost:3000/signalk/v1/history/contexts?duration=1h"
Get contexts for specific time range:
curl "http://localhost:3000/signalk/v1/history/contexts?from=2025-01-01T00:00:00Z&to=2025-01-07T00:00:00Z"
Get available paths with recent data:
curl "http://localhost:3000/signalk/v1/history/paths?duration=24h"
Get all paths (no time filter):
curl "http://localhost:3000/signalk/v1/history/paths"
Unit Conversion (NEW in v0.6.0)
Convert to user's preferred units:
# Speed in knots (if configured in signalk-units-preference)
curl "http://localhost:3000/signalk/v1/history/values?duration=2d&paths=navigation.speedOverGround&convertUnits=true"
# Wind speed in preferred units (knots, km/h, or mph)
curl "http://localhost:3000/signalk/v1/history/values?duration=1h&paths=environment.wind.speedApparent&convertUnits=true"
# Temperature in preferred units (°C or °F)
curl "http://localhost:3000/signalk/v1/history/values?duration=24h&paths=environment.outside.temperature&convertUnits=true"
Response includes conversion metadata:
{
"values": [{"path": "navigation.speedOverGround", "method": "average"}],
"data": [["2025-10-20T16:12:14Z", 5.2]],
"units": {
"converted": true,
"conversions": [{
"path": "navigation.speedOverGround",
"baseUnit": "m/s",
"targetUnit": "knots",
"symbol": "kn"
}]
}
}
Timezone Conversion (NEW in v0.6.0)
Convert to server's local time:
curl "http://localhost:3000/signalk/v1/history/values?duration=2d&paths=environment.wind.speedApparent&convertTimesToLocal=true"
Convert to specific timezone:
# New York time (Eastern)
curl "http://localhost:3000/signalk/v1/history/values?duration=2d&paths=navigation.position&convertTimesToLocal=true&timezone=America/New_York"
# London time
curl "http://localhost:3000/signalk/v1/history/values?duration=1h&paths=environment.wind.speedApparent&convertTimesToLocal=true&timezone=Europe/London"
# Tokyo time
curl "http://localhost:3000/signalk/v1/history/values?duration=1h&paths=navigation.speedOverGround&convertTimesToLocal=true&timezone=Asia/Tokyo"
Response includes timezone metadata:
{
"range": {
"from": "2025-10-20T12:12:19-04:00",
"to": "2025-10-20T13:12:19-04:00"
},
"data": [
["2025-10-20T12:12:14-04:00", 5.84],
["2025-10-20T12:12:28-04:00", 5.26]
],
"timezone": {
"converted": true,
"targetTimezone": "America/New_York",
"offset": "-04:00",
"description": "Converted to user-specified timezone: America/New_York (-04:00)"
}
}
Combine both conversions:
# Convert values to knots AND timestamps to New York time
curl "http://localhost:3000/signalk/v1/history/values?duration=2d&paths=navigation.speedOverGround,environment.wind.speedApparent&convertUnits=true&convertTimesToLocal=true&timezone=America/New_York"
Common IANA Timezone IDs:
- America/New_York - Eastern Time (US)
- America/Chicago - Central Time (US)
- America/Denver - Mountain Time (US)
- America/Los_Angeles - Pacific Time (US)
- Europe/London - UK
- Europe/Paris - Central European Time
- Asia/Tokyo - Japan
- Pacific/Auckland - New Zealand
- Australia/Sydney - Australian Eastern Time
Duration Formats
- 30s - 30 seconds
- 15m - 15 minutes
- 2h - 2 hours
- 1d - 1 day
Timezone Handling (NEW)
Local time conversion (default behavior):
# 8:00 AM local time → automatically converted to UTC
curl "http://localhost:3000/signalk/v1/history/values?context=vessels.self&start=2025-08-13T08:00:00&duration=1h&paths=navigation.position"
UTC time mode:
# 8:00 AM UTC (not converted)
curl "http://localhost:3000/signalk/v1/history/values?context=vessels.self&start=2025-08-13T08:00:00&duration=1h&paths=navigation.position&useUTC=true"
Explicit timezone (always respected):
# Explicit UTC timezone
curl "http://localhost:3000/signalk/v1/history/values?context=vessels.self&start=2025-08-13T08:00:00Z&duration=1h&paths=navigation.position"
# Explicit timezone offset
curl "http://localhost:3000/signalk/v1/history/values?context=vessels.self&start=2025-08-13T08:00:00-04:00&duration=1h&paths=navigation.position"
Timezone behavior:
- Default (useUTC=false): Datetime strings without timezone info are treated as local time and automatically converted to UTC
- UTC mode (useUTC=true): Datetime strings without timezone info are treated as UTC time
- Explicit timezone: Strings with Z, +HH:MM, or -HH:MM are always parsed as-is regardless of the useUTC setting
- start=now: Always uses the current UTC time regardless of the useUTC setting
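A minimal sketch of these rules, assuming unzoned strings are resolved with JavaScript's own local-time interpretation (illustrative only, not the plugin's actual parser):

```typescript
// Sketch of resolving a start/from/to input to a UTC instant (illustrative only).
function resolveInput(input: string, useUTC: boolean): Date {
  if (input === 'now') return new Date();                  // always the current UTC instant
  const hasExplicitZone = /(?:Z|[+-]\d{2}:\d{2})$/.test(input);
  if (hasExplicitZone) return new Date(input);             // parsed as-is
  return useUTC
    ? new Date(`${input}Z`)                                // treat as UTC
    : new Date(input);                                     // treat as server-local time
}

console.log(resolveInput('2025-08-13T08:00:00', false).toISOString()); // local → UTC
console.log(resolveInput('2025-08-13T08:00:00', true).toISOString());  // 08:00:00 UTC
```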
Get available contexts:
curl "http://localhost:3000/signalk/v1/history/contexts"
Time Alignment and Bucketing
The History API automatically aligns data from different paths using time bucketing to solve the common problem of misaligned timestamps. This enables:
- Plotting: Data points align properly on charts
- Correlation: Compare values from different sensors at the same time
- Export: Clean, aligned datasets for analysis
Key Features:
- Smart Type Handling: Automatically handles numeric values (wind speed) and JSON objects (position)
- Robust Aggregation: Uses proper SQL type casting to prevent type errors
- Configurable Resolution: Time bucket size in milliseconds (default: auto-calculated based on time range)
- Multiple Aggregation Methods: average for numeric data, first for complex objects
Parameters:
- resolution - Time bucket size in milliseconds (default: auto-calculated)
- Aggregation methods: average, min, max, first, last, mid, middle_index
Aggregation Methods:
- average - Average value in time bucket (default for numeric data)
- min - Minimum value in time bucket
- max - Maximum value in time bucket
- first - First value in time bucket (default for objects)
- last - Last value in time bucket
- mid - Median value (average of middle values for even counts)
- middle_index - Middle value by index (first of two middle values for even counts)
When to Use Each Method:
- Numeric data (wind speed, voltage, etc.): Use average, min, max for statistics
- Position data: Use first, last, middle_index for specific readings
- String/object data: Avoid mid (unpredictable), prefer first, last, middle_index
- Multiple stats: Query the same path with different methods (e.g., environment.wind.speedApparent:average,environment.wind.speedApparent:max)
Response Format
The History API returns time-aligned data in standard SignalK format.
Default Response (without moving averages)
{
"context": "vessels.self",
"range": {
"from": "2025-01-01T00:00:00Z",
"to": "2025-01-01T06:00:00Z"
},
"values": [
{
"path": "environment.wind.speedApparent",
"method": "average"
},
{
"path": "navigation.position",
"method": "first"
}
],
"data": [
["2025-01-01T00:00:00Z", 12.5, {"latitude": 37.7749, "longitude": -122.4194}],
["2025-01-01T00:01:00Z", 13.2, {"latitude": 37.7750, "longitude": -122.4195}],
["2025-01-01T00:02:00Z", 11.8, {"latitude": 37.7751, "longitude": -122.4196}]
]
}
With Moving Averages (includeMovingAverages=true)
{
"context": "vessels.self",
"range": {
"from": "2025-01-01T00:00:00Z",
"to": "2025-01-01T06:00:00Z"
},
"values": [
{
"path": "environment.wind.speedApparent",
"method": "average"
},
{
"path": "environment.wind.speedApparent.ema",
"method": "ema"
},
{
"path": "environment.wind.speedApparent.sma",
"method": "sma"
},
{
"path": "navigation.position",
"method": "first"
}
],
"data": [
["2025-01-01T00:00:00Z", 12.5, 12.5, 12.5, {"latitude": 37.7749, "longitude": -122.4194}],
["2025-01-01T00:01:00Z", 13.2, 12.64, 12.85, {"latitude": 37.7750, "longitude": -122.4195}],
["2025-01-01T00:02:00Z", 11.8, 12.45, 12.5, {"latitude": 37.7751, "longitude": -122.4196}]
]
}
Notes:
- Each data array element is [timestamp, value1, value2, ...] corresponding to the paths in the values array
- Moving averages (EMA/SMA) are opt-in - add includeMovingAverages=true to include them
- EMA/SMA are only calculated for numeric values; non-numeric values (objects, strings) show null for their EMA/SMA columns
- Without includeMovingAverages, the response size is ~66% smaller
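A small client-side sketch of consuming this format follows; the response shape matches the examples above, while the helper name and URL are assumptions for illustration.

```typescript
// Unpack a History API response into one series per requested path (sketch only).
interface HistoryResponse {
  values: { path: string; method: string }[];
  data: [string, ...unknown[]][]; // [timestamp, value1, value2, ...]
}

async function fetchSeries(url: string): Promise<Map<string, [string, unknown][]>> {
  const res = await fetch(url); // fetch is built into Node.js 18+
  const body = (await res.json()) as HistoryResponse;
  const series = new Map<string, [string, unknown][]>();
  body.values.forEach((v, i) => {
    // Column i+1 of each data row corresponds to values[i].
    series.set(
      `${v.path}:${v.method}`,
      body.data.map(row => [row[0], row[i + 1]] as [string, unknown])
    );
  });
  return series;
}

// Usage (example host and query):
// const s = await fetchSeries('http://localhost:3000/signalk/v1/history/values?duration=1h&paths=environment.wind.speedApparent');
```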
Claude AI Analysis
The plugin integrates Claude AI to provide intelligent analysis of maritime data, offering insights that would be difficult to extract through traditional querying methods.
Advanced Charting and Visualization
Claude AI can generate interactive charts and visualizations directly from your data using Plotly.js specifications. Charts are automatically embedded in analysis responses when analysis would benefit from visualization.
Supported Chart Types:
- Line Charts: Time series trends for navigation, environmental, and performance data
- Bar Charts: Categorical analysis and frequency distributions
- Scatter Plots: Correlation analysis between different parameters
- Wind Roses/Radar Charts: Professional wind direction and speed frequency analysis
- Multiple Series Charts: Compare multiple data streams on the same chart
- Polar Charts: Wind patterns, compass headings, and directional data
Marine-Specific Chart Features:
- Wind Analysis: Automated wind rose generation with Beaufort scale categories
- Navigation Plots: Course over ground, speed trends, and position tracking
- Environmental Monitoring: Temperature, pressure, and weather pattern visualization
- Performance Analysis: Fuel efficiency, battery usage, and system performance charts
- Multi-Vessel Comparisons: Side-by-side analysis of multiple vessels
Chart Data Integrity:
- All chart data is sourced directly from database queries - no fabricated or estimated data
- Charts display exact data points from query results with full traceability
- Automatic validation ensures chart data matches query output
- Time-aligned data from History API ensures accurate multi-parameter visualization
Example Chart Generation: When you ask Claude to "analyze wind patterns over the last 48 hours", it will:
- Query your wind direction and speed data
- Generate a wind rose chart showing frequency by compass direction
- Color-code by wind speed categories (calm, light breeze, strong breeze, etc.)
- Display the chart as interactive Plotly.js visualization in the web interface
Charts are automatically included when analysis benefits from visualization, or you can explicitly request specific chart types like "create a line chart" or "show me a wind rose".
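For orientation, a hand-written Plotly.js trace for a simple wind rose might look like the sketch below; the numbers are made up for illustration and are not produced by the plugin.

```typescript
// Illustrative Plotly.js wind-rose spec (example data, not plugin output).
// Each barpolar trace is one speed category; theta is the compass sector,
// r is the observation count in that sector.
const windRose = {
  data: [
    { type: 'barpolar', name: '0-5 kn',  theta: ['N', 'NE', 'E', 'SE'], r: [12, 8, 5, 3] },
    { type: 'barpolar', name: '5-10 kn', theta: ['N', 'NE', 'E', 'SE'], r: [6, 14, 9, 4] },
  ],
  layout: {
    title: 'Apparent wind distribution (example)',
    polar: { angularaxis: { direction: 'clockwise' } },
  },
};

// In a browser context this would render with:
// Plotly.newPlot('chart', windRose.data, windRose.layout);
console.log(windRose.data.length); // two speed categories
```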
PLANNED Analysis Templates (NOT YET IMPLEMENTED)
The following are examples of possible pre-built analysis templates that would provide ready-to-use analysis for common maritime operations:
Navigation & Routing Templates
- Navigation Summary: Comprehensive analysis of navigation patterns and route efficiency
- Route Optimization: Identify opportunities to optimize routes for efficiency and safety
- Anchoring Analysis: Analyze anchoring patterns, duration, and safety considerations
Weather & Environment Templates
- Weather Impact Analysis: Analyze how weather conditions affect vessel performance
- Wind Pattern Analysis: Detailed wind analysis for sailing optimization
Electrical System Templates
- Battery Health Assessment: Comprehensive battery performance and charging pattern analysis
- Power Consumption Analysis: Analyze electrical power usage patterns and efficiency
Safety & Monitoring Templates
- Safety Anomaly Detection: Detect unusual patterns that might indicate safety concerns
- Equipment Health Monitoring: Monitor equipment performance and predict maintenance needs
Performance & Efficiency Templates
- Fuel Efficiency Analysis: Analyze fuel consumption patterns and identify efficiency opportunities
- Overall Performance Trends: Comprehensive vessel performance analysis over time
Using Claude AI Analysis
Via Web Interface
- Navigate to the plugin's web interface
- Go to the 🧠 AI Analysis tab
- Select a data path to analyze
- Choose an analysis template or create custom analysis
- Configure time range and analysis parameters
- Click Analyze Data to generate insights
Via API
Test Claude Connection:
curl -X POST http://localhost:3000/plugins/signalk-parquet/api/analyze/test-connection
Get Available Templates:
curl http://localhost:3000/plugins/signalk-parquet/api/analyze/templates
Custom Analysis:
curl -X POST http://localhost:3000/plugins/signalk-parquet/api/analyze \
-H "Content-Type: application/json" \
-d '{
"dataPath": "environment.wind.speedTrue,navigation.speedOverGround",
"analysisType": "custom",
"customPrompt": "Analyze the relationship between wind speed and vessel speed. Identify optimal wind conditions for best performance.",
"timeRange": {
"start": "2025-01-01T00:00:00Z",
"end": "2025-01-07T00:00:00Z"
},
"aggregationMethod": "average",
"resolution": "3600000"
}'
Analysis Response Format
Claude AI analysis returns structured insights:
{
"id": "analysis_1234567890_abcdef123",
"analysis": "Main analysis text with detailed insights",
"insights": [
"Key insight 1",
"Key insight 2",
"Key insight 3"
],
"recommendations": [
"Actionable recommendation 1",
"Actionable recommendation 2"
],
"anomalies": [
{
"timestamp": "2025-01-01T12:00:00Z",
"value": 25.5,
"expectedRange": {"min": 10.0, "max": 20.0},
"severity": "medium",
"description": "Wind speed higher than normal range",
"confidence": 0.87
}
],
"confidence": 0.92,
"dataQuality": "High quality data with 98% completeness",
"timestamp": "2025-01-01T15:30:00Z",
"metadata": {
"dataPath": "environment.wind.speedTrue",
"analysisType": "summary",
"recordCount": 1440,
"timeRange": {
"start": "2025-01-01T00:00:00Z",
"end": "2025-01-02T00:00:00Z"
}
}
}
Analysis History
All Claude AI analyses are automatically saved and can be retrieved:
Get Analysis History:
curl http://localhost:3000/plugins/signalk-parquet/api/analyze/history?limit=10
History files are stored in: data/analysis-history/analysis_*.json
Best Practices
- Data Quality: Ensure good data coverage for more reliable analysis
- Time Ranges: Use appropriate time ranges - longer for trends, shorter for anomalies
- Path Selection: Combine related paths for correlation analysis
- Template Usage: Start with templates then customize prompts as needed
- API Limits: Be mindful of Anthropic API token limits and costs
- Model Selection: Use Opus for complex analysis, Sonnet for general use, Haiku for quick insights
Troubleshooting Claude AI
Common Issues:
- "Claude not enabled": Check plugin configuration and enable Claude integration
- "API key missing": Add valid Anthropic API key in plugin settings
- "Analysis timeout": Reduce data size or use faster model (Haiku)
- "Token limit exceeded": Reduce time range or use data sampling
Debug Claude Integration:
# Test API connection
curl -X POST http://localhost:3000/plugins/signalk-parquet/api/analyze/test-connection
# Check plugin logs for Claude-specific messages
journalctl -u signalk -f | grep -i claude
Moving Averages (EMA & SMA)
The plugin calculates Exponential Moving Average (EMA) and Simple Moving Average (SMA) for numeric values when explicitly requested via the includeMovingAverages parameter, providing enhanced trend analysis capabilities.
How to Enable
History API:
# Add includeMovingAverages=true to any query
curl "http://localhost:3000/signalk/v1/history/values?duration=1h&paths=environment.wind.speedApparent&includeMovingAverages=true"
Default Behavior (v0.5.6+):
- Moving averages are opt-in - not included by default
- Reduces response size by ~66% when not needed
- Better API compliance with SignalK specification
Legacy Behavior (pre-v0.5.6):
- Moving averages were automatically included for all queries
- To maintain the old behavior, add includeMovingAverages=true to all requests
Calculation Details
Exponential Moving Average (EMA)
- Period: ~10 equivalent (α = 0.2)
- Formula: EMA = α × currentValue + (1 − α) × previousEMA
- Characteristic: Responds faster to recent changes, emphasizes recent data
- Use Case: Trend detection, rapid response to data changes
Simple Moving Average (SMA)
- Period: 10 data points
- Formula: Average of the last 10 values
- Characteristic: Smooths out fluctuations, equal weight to all values in window
- Use Case: Noise reduction, general trend analysis
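Both calculations can be sketched in a few lines using the parameters above (α = 0.2, 10-point window); this is an illustration, not the plugin's implementation.

```typescript
// EMA (α = 0.2) and rolling 10-point SMA over a numeric series (illustrative).
function addMovingAverages(values: number[]): { ema: number[]; sma: number[] } {
  const alpha = 0.2;
  const window = 10;
  const ema: number[] = [];
  const sma: number[] = [];
  values.forEach((v, i) => {
    // EMA = α × currentValue + (1 − α) × previousEMA, seeded with the first value
    ema.push(i === 0 ? v : alpha * v + (1 - alpha) * ema[i - 1]);
    // SMA = average of the last `window` values
    const slice = values.slice(Math.max(0, i - window + 1), i + 1);
    sma.push(slice.reduce((a, b) => a + b, 0) / slice.length);
  });
  // Round to 3 decimal places, as the API does, to avoid floating-point noise.
  const round = (x: number) => Math.round(x * 1000) / 1000;
  return { ema: ema.map(round), sma: sma.map(round) };
}

console.log(addMovingAverages([5.0, 6.0, 4.0]));
// → { ema: [5, 5.2, 4.96], sma: [5, 5.5, 5] }
```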
Data Flow & Continuity
// Initial Data Load (isIncremental: false)
Point 1: Value=5.0, EMA=5.0, SMA=5.0
Point 2: Value=6.0, EMA=5.2, SMA=5.5
Point 3: Value=4.0, EMA=5.0, SMA=5.0
// Incremental Updates (isIncremental: true)
Point 4: Value=7.0, EMA=5.4, SMA=5.5 // Continues from previous EMA
Point 5: Value=5.5, EMA=5.42, SMA=5.5 // Rolling 10-point SMA window
Key Features
- 🎛️ Opt-In: Add includeMovingAverages=true to enable (v0.5.6+)
- ✅ Memory Efficient: SMA maintains a rolling 10-point window
- ✅ Non-Numeric Handling: Non-numeric values (strings, objects) show null for EMA/SMA
- ✅ Precision: Values rounded to 3 decimal places to prevent floating-point noise
- ⚡ Performance: Smaller response sizes when not needed
Real-world Applications
Marine Data Examples:
- Wind Speed: EMA detects gusts quickly, SMA shows general wind conditions
- Battery Voltage: EMA shows charging/discharging trends, SMA indicates overall battery health
- Engine RPM: EMA responds to throttle changes, SMA shows average operating level
- Water Temperature: EMA detects thermal changes, SMA provides stable baseline
Available in:
- 📊 History API: Add includeMovingAverages=true to include EMA/SMA calculations
S3 Integration
Upload Timing
Real-time Upload: Files are uploaded immediately after creation
{
"s3Upload": {
"enabled": true,
"timing": "realtime"
}
}
Consolidation Upload: Files are uploaded after daily consolidation
{
"s3Upload": {
"enabled": true,
"timing": "consolidation"
}
}
S3 Key Structure
With prefix marine-data/:
marine-data/vessels/self/navigation/position/signalk_data_20250716_consolidated.parquet
marine-data/vessels/self/environment/wind/angleApparent/signalk_data_20250716_120000.parquet
File Consolidation
The plugin automatically consolidates files daily at midnight UTC:
- File Discovery: Finds all files for the previous day
- Merging: Combines files by SignalK path
- Sorting: Sorts records by timestamp
- Cleanup: Moves source files to the processed/ directory
- S3 Upload: Uploads consolidated files if configured
Performance Characteristics
- Memory Usage: Configurable buffer sizes (default 1000 records)
- Disk I/O: Efficient batch writes with configurable intervals
- CPU Usage: Minimal - mostly I/O bound operations
- Network: Optional S3 uploads with retry logic
Development
Project Structure
signalk-parquet/
├── src/
│ ├── index.ts # Main plugin entry point and lifecycle (~340 lines)
│ ├── commands.ts # Command management system (~400 lines)
│ ├── data-handler.ts # Data processing, subscriptions, S3 (~650 lines)
│ ├── api-routes.ts # Web API endpoints (~600 lines)
│ ├── types.ts # TypeScript interfaces (~360 lines)
│ ├── parquet-writer.ts # File writing logic
│ ├── HistoryAPI.ts # SignalK History API implementation
│ ├── HistoryAPI-types.ts # History API type definitions
│ └── utils/
│ └── path-helpers.ts # Path utility functions
├── dist/ # Compiled JavaScript
├── public/
│ ├── index.html # Web interface
│ └── parquet.png # Plugin icon
├── tsconfig.json # TypeScript configuration
├── package.json # Dependencies and scripts
└── README.md # This file
Code Architecture
The plugin uses a modular TypeScript architecture for maintainability:
- index.ts: Plugin lifecycle, configuration, and initialization
- commands.ts: SignalK command registration, execution, and management
- data-handler.ts: Data subscriptions, buffering, consolidation, and S3 operations
- api-routes.ts: REST API endpoints for the web interface
- types.ts: Comprehensive TypeScript type definitions
- utils/: Utility functions and helpers
Adding New Features
- API Endpoints: Add to src/api-routes.ts
- Data Processing: Extend src/data-handler.ts
- Commands: Modify src/commands.ts
- Types: Add interfaces to src/types.ts
- Claude AI Models: Update src/claude-models.ts (see below)
- Documentation: Update the README and inline comments
Updating Claude AI Models
When Anthropic releases new models, update the single source of truth in src/claude-models.ts:
export const CLAUDE_MODELS = {
OPUS_4_1: 'claude-opus-4-1-20250805',
OPUS_4: 'claude-opus-4-20250514',
SONNET_4: 'claude-sonnet-4-20250514',
SONNET_4_5: 'claude-sonnet-4-5-20250929',
// Add new models here
} as const;
export const SUPPORTED_CLAUDE_MODELS = [
CLAUDE_MODELS.OPUS_4_1,
CLAUDE_MODELS.OPUS_4,
CLAUDE_MODELS.SONNET_4,
CLAUDE_MODELS.SONNET_4_5,
// Add to supported list
] as const;
export const DEFAULT_CLAUDE_MODEL = CLAUDE_MODELS.SONNET_4_5; // Update default if needed
export const CLAUDE_MODEL_DESCRIPTIONS = {
[CLAUDE_MODELS.OPUS_4_1]: 'Claude Opus 4.1 (Most Capable & Intelligent)',
[CLAUDE_MODELS.OPUS_4]: 'Claude Opus 4 (Previous Flagship)',
[CLAUDE_MODELS.SONNET_4]: 'Claude Sonnet 4 (Balanced Performance)',
[CLAUDE_MODELS.SONNET_4_5]: 'Claude Sonnet 4.5 (Latest Sonnet)',
// Add descriptions for new models
} as const;
Why this matters:
- All model definitions are centralized in one file
- Type safety across the entire codebase
- Automatic migration of outdated models on plugin startup
- Prevents form validation errors when users have old model values saved
- No need to update multiple files when adding new models
The plugin automatically migrates old/invalid model values to the current default on startup, preventing configuration save failures.
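A hypothetical sketch of that startup migration, reusing the exports shown above (the function itself is an assumption, not the plugin's actual code):

```typescript
// Hypothetical startup migration of a saved model value (illustrative only).
import { SUPPORTED_CLAUDE_MODELS, DEFAULT_CLAUDE_MODEL } from './claude-models';

function migrateModel(configuredModel?: string): string {
  const supported = SUPPORTED_CLAUDE_MODELS as readonly string[];
  if (configuredModel && supported.includes(configuredModel)) {
    return configuredModel;      // still valid → keep the user's choice
  }
  return DEFAULT_CLAUDE_MODEL;   // missing or outdated → fall back to the default
}

// e.g. a legacy saved value is replaced on startup:
// migrateModel('claude-3-7-sonnet-20250219') → DEFAULT_CLAUDE_MODEL
```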
Type Checking
The plugin uses strict TypeScript configuration:
{
"compilerOptions": {
"strict": true,
"noImplicitAny": true,
"noImplicitReturns": true,
"strictNullChecks": true
}
}
Troubleshooting
Common Issues
Build Errors
# Clean and rebuild
npm run clean
npm run build
DuckDB Not Available
- Check that @duckdb/node-api is installed
- Verify Node.js version compatibility (>=16.0.0)
S3 Upload Failures
- Verify AWS credentials and permissions
- Check S3 bucket exists and is accessible
- Test connection using web interface
No Data Collection
- Verify path configurations are correct
- Check if regimens are properly activated
- Review SignalK logs for subscription errors
Debug Mode
Enable debug logging in SignalK:
{
"settings": {
"debug": "signalk-parquet*"
}
}
Runtime Dependencies
- @dsnp/parquetjs: Parquet file format support
- @duckdb/node-api: SQL query engine
- @aws-sdk/client-s3: S3 upload functionality
- fs-extra: Enhanced file system operations
- glob: File pattern matching
- express: Web server framework
Development Dependencies
- typescript: TypeScript compiler
- @types/node: Node.js type definitions
- @types/express: Express type definitions
- @types/fs-extra: fs-extra type definitions
License
MIT License - See LICENSE file for details.
Testing
Comprehensive testing procedures are documented in TESTING.md. The testing guide covers:
- Installation and build verification
- Plugin configuration testing
- Web interface functionality
- Data collection validation
- Regimen control testing
- File output verification
- S3 integration testing
- API endpoint testing
- Performance testing
- Error handling validation
Quick Test
# Test plugin health
curl http://localhost:3000/plugins/signalk-parquet/api/health
# Test path configuration
curl http://localhost:3000/plugins/signalk-parquet/api/config/paths
# Test data collection
curl http://localhost:3000/plugins/signalk-parquet/api/paths
# Test History API
curl "http://localhost:3000/signalk/v1/history/contexts"
TODO
- [x] Implement startup consolidation for missed previous days (exclude current day)
- [x] Add history API integration
- [ ] Incorporate user preferences from units-preference in the regimen filter system
- [ ] Expose recorded spatial event via api endpoint (geojson)
- [ ] Add Grafana integration
Contributing
- Fork the repository
- Create a feature branch
- Add TypeScript types for new features
- Include tests and documentation
- Follow the testing procedures in TESTING.md
- Submit a pull request
Changelog
See CHANGELOG.md for complete version history.
Version 0.5.6-beta.1 (Latest)
- 🎯 SignalK History API Compliance: Full support for all 5 standard time range patterns
- ⏪ Backward Compatibility: Legacy start parameter supported with deprecation warnings
- 🎛️ Optional Moving Averages: EMA/SMA now opt-in via the includeMovingAverages parameter
- 🔍 Time-Filtered Discovery: Paths and contexts endpoints accept time range parameters
- ⚡ Performance: 4.3x faster context discovery (13s → 3s) with SQL optimization and caching
Version 0.5.5-beta.4
- 🔧 Threshold Automation State Machine Fix: Fixed automation enable/disable transitions to properly execute state changes
- When enabling automation (.auto = true): Command is now set to OFF, then all thresholds are immediately evaluated
- When disabling automation (.auto = false): Threshold monitoring stops and command state remains unchanged
- Default command state is hardcoded to OFF on server side
- Fixed autoPutHandler in src/commands.ts to execute one-time transition operations when the .auto path toggles
- Ensures thresholds take control immediately upon automation enable instead of waiting for the next SignalK delta update
- Removed user-configurable default state dropdown from UI (both add and edit command forms)
Version 0.5.5-beta.3
- 🧱 Front-end Modularization: Replaced the 5,000-line inline dashboard script with focused JS modules under public/js, improving readability and maintainability.
- ⚙️ Threshold Automation Fix: Threshold monitoring now listens to raw SignalK values via getSelfStream, so saved trigger conditions reliably toggle their commands.
Version 0.5.5-beta.1
- 🌍 NEW Spatial Analysis System: Advanced geographic analysis capabilities with DuckDB spatial extension
- Complete spatial function integration: ST_Point, ST_Distance_Sphere, ST_Centroid, ST_ConvexHull, ST_AsText
- Automatic spatial extension loading in all DuckDB connections for seamless geographic queries
- Track analysis with consecutive position distance calculations using LAG() window functions
- Movement area analysis with bounding boxes and geographic coordinate system calculations
- Multi-vessel proximity detection for collision avoidance and traffic analysis
- 🧠 Enhanced Claude AI Spatial Intelligence: Claude now automatically suggests spatial queries for geographic questions
- Comprehensive spatial query patterns and examples integrated into Claude's analysis prompts
- Automatic detection of spatial analysis opportunities (track analysis, distance calculations, movement patterns)
- Pre-built spatial query templates for common maritime analysis scenarios
- Advanced movement analysis with time bucketing, centroids, and area calculations
- 📊 Standardized Query Syntax: All queries now use read_parquet() function for better schema flexibility
- Updated HistoryAPI, api-routes, and claude-analyzer to use read_parquet() with union_by_name=true
- Enhanced schema compatibility and better error handling for evolving data structures
- Consistent query patterns across all system components for maintainability
- 📖 Comprehensive Documentation: Enhanced README with spatial analysis examples and capabilities
- Advanced spatial query examples including distance tracking and movement analysis
- Complete spatial function reference with practical maritime use cases
- Multi-vessel proximity analysis templates for collision detection scenarios
Version 0.5.4-beta.1
- 🔍 NEW Data Validation System: Comprehensive Parquet file schema validation against SignalK metadata standards
- Real-time validation of file schemas with progress tracking and cancellation support
- Detects incorrect data types (e.g., numeric strings, boolean strings) in existing files
- Validates against SignalK metadata units (meters, volts, amperes) for proper type mapping
- 🔧 NEW Automated Schema Repair: One-click repair of schema violations with safe backup operations
- Automatic conversion of incorrectly stored data types (UTF8 → DOUBLE, UTF8 → BOOLEAN)
- Creates backup files before modification and quarantines corrupted files
- Processes thousands of files with real-time progress monitoring
- ⚡ Major Performance Fix: Resolved repair hanging issue on ARM systems (Raspberry Pi, AWS)
- Replaced problematic DuckDB command-line spawning with direct parquet library schema reading
- Repair now works reliably across all architectures (x86, ARM64, Apple Silicon)
- Unified schema reading approach between validation and repair for consistency
- 📊 Storage & Query Benefits: Proper data types provide significant performance improvements
- ~50% storage reduction for numeric data (DOUBLE vs UTF8 strings)
- 5-10x faster query performance with native numeric operations
- Enhanced data integrity and analytics compatibility
Version 0.5.3-beta.2
- 🎨 Enhanced User Interface: Major improvements to path configuration and form usability
- Intelligent SignalK path dropdowns with real-time data population
- Radio button filters to distinguish self vessel vs other vessel paths
- Dynamic regimen selection with checkbox interfaces replacing text inputs
- Auto-populated regimens from defined commands API with dynamic updates
- Improved form layout with proper label/checkbox alignment and spacing
- Support for both pre-defined and custom regimen selection
- 🎮 Regimen/Commands Manager Enhancements: Streamlined command management experience
- Renamed "Command Manager" to "Regimen/Commands Manager" for clarity
- Auto-refresh functionality when selecting the tab (eliminates manual refresh)
- Removed redundant manual refresh button for cleaner interface
- Dynamic regimen filtering excludes command paths from path dropdowns
- 🧠 AI Analysis Improvements: Enhanced analysis experience and cancellation controls
- Analysis button shows "Running Analysis (click to cancel)" during processing
- Visual feedback with color changes during analysis execution
- Dual cancellation options: button click or separate cancel button
- Improved error handling distinguishing between cancelled vs failed analyses
- Better prevention of double-clicking issues with clear visual states
- ⚙️ Configuration Management: More flexible path and regimen configuration
- Removed regimen requirement validation from edit forms (optional regimens)
- Enhanced dropdown population excludes already configured command paths
- Better alignment of UI elements with consistent styling patterns
- Tab labels consolidated to single lines for cleaner navigation
Version 0.5.3-beta.1
- 📊 Advanced Charting & Visualization: Comprehensive chart generation capabilities with Claude AI
- Interactive Plotly.js chart embedding with marine-specific visualizations
- Automated wind rose generation with Beaufort scale categories and compass sectors
- Multiple chart types: line charts, bar charts, scatter plots, polar charts, radar charts
- Wind analysis tools with directional frequency distributions
- Chart data integrity validation ensuring all visualizations use real query data
- Time-aligned multi-parameter visualization support via History API
- 🔍 Enhanced Path Discovery: Improved SignalK path discovery and source filtering
- StreamBundle integration for efficient path enumeration
- Better source filtering with wildcard pattern support
- Enhanced debug logging for path discovery troubleshooting
- 🛡️ Parquet File Validation: Added comprehensive corruption detection and quarantine