sssom-js
v0.4.5
Simple Standard for Sharing Ontology Mappings (SSSOM) JavaScript library
This Node package provides methods and a command line client to process mappings in SSSOM format.
It implements parsing variants of SSSOM (TSV, CSV and JSON) with validation and transformation to multiple formats, including JSKOS and RDF.
Install
Requires Node.js >= 20.19.
npm install sssom-js
For RDF export in the command line client also install jsonld2rdf.
npm install jsonld2rdf
The web interface can be deployed on any web server by copying static files from directory docs of the source code repository.
Usage
Command line
The package includes a command line client to parse and convert SSSOM. Usage and options:
sssom [options] [<mappings-file> [<metadata-file>]]
short | long | argument | description
-----|------------------|----------|-------------
-f | --from | format | input format (csv, tsv, json)
-t | --to | format | output format (json, ndjson, jskos, ndjskos, nq, nt, ttl)
-o | --output | file | output filename (default: - for stdout)
-p | --propagate | | add propagatable slots to mappings
-b | --liberal | | parse less strictly than the specification
-c | --curie | file | additional CURIE map (JSON or YAML file)
-s | --schemes | file | JSKOS concept schemes to detect
-m | --mappings | | emit mappings only
-v | --verbose | | emit errors verbosely
-j | --json-errors | | emit errors in JSON (Data Validation Error Format)
-h | --help | | emit usage information
-V | --version | | emit the version number
Web interface
A web interface to validate and transform SSSOM/TSV is available at https://gbv.github.io/sssom-js/. The application is not included in the package release on npm.
API
import { parseSSSOM, TSVReader, toJskosRegistry, toJskosMapping } from "sssom-js"
parseSSSOM (input, options)
This asynchronous function parses SSSOM in an input format from a stream or file and returns a mapping set on success. The result should be directly serializable as SSSOM/JSON (or as JSKOS if option to is set to jskos).
import { parseSSSOM } from "sssom-js"
const { mappings, ...metadata } = await parseSSSOM(process.stdin)
A falsy input value skips processing of mappings, so only the mapping set metadata is returned:
const metadata = await parseSSSOM(false, { metadata: "metadata.sssom.yaml" })
See below for a description of common options. Additional options are:
- metadataHandler (function) called for parsed metadata
- mappingHandler (function) called for each parsed mapping
parseSSSOMString (input, options)
This is a utility function to parse SSSOM from a string. Equivalent implementation in Node.js:
import { Readable } from "stream"
const parseSSSOMString = (input, options={}) => parseSSSOM(Readable.from(input), options)
TSVReader
This event emitter parses SSSOM/TSV from a stream and emits metadata and mapping events:
import fs from "fs"
import { TSVReader } from "sssom-js"
const input = fs.createReadStream("test/valid/minimal.sssom.tsv")
new TSVReader(input)
.on("metadata", console.log)
.on("mapping", console.log)
.on("error", console.error)
.on("end", console.log)
new TSVReader(input, { delimiter: "," }) // parse SSSOM/CSV
A second optional argument can be used to pass options propagate (boolean), liberal (boolean), delimiter (string), curie (object), metadata (object), and storeMappings (boolean). The latter causes mappings to be collected and returned with the result at the end of parsing.
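Because the reader reports its results through events, it can be convenient to wrap it in a promise. The following is a minimal sketch and not part of the package: the helper name collectMappings is hypothetical, and it relies only on the four events shown above. The storeMappings option remains the built-in way to get collected mappings; the sketch merely illustrates the event flow.

```javascript
// Hypothetical helper (not part of sssom-js): gather everything a reader emits.
// Works with any emitter that fires "metadata", "mapping", "error" and "end"
// and whose on() method returns the emitter, as Node.js event emitters do.
const collectMappings = reader => new Promise((resolve, reject) => {
  const result = { metadata: null, mappings: [] }
  reader
    .on("metadata", metadata => { result.metadata = metadata })
    .on("mapping", mapping => { result.mappings.push(mapping) })
    .on("error", reject)
    .on("end", () => resolve(result))
})

// Usage, assuming `input` is a readable stream as in the example above:
// const { metadata, mappings } = await collectMappings(new TSVReader(input))
```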
toJskosRegistry
Convert a parsed MappingSet to a JSKOS Registry object.
toJskosMapping
Convert a parsed Mapping to a JSKOS Concept Mapping object.
Options
The following options are supported by both the command line client and the API:
propagate
Enables propagation of mapping set slots. False by default.
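As a rough illustration of what propagation does, the sketch below copies slots from the mapping set to every mapping. It is a simplification and not the package's implementation: the function name propagateSlots and the short slot list are assumptions made for this example, and the overwrite behaviour follows what is noted under Limitations.

```javascript
// Simplified sketch of slot propagation, not the code used by sssom-js.
// PROPAGATABLE is an illustrative subset of the propagatable slots.
const PROPAGATABLE = ["mapping_date", "mapping_provider", "subject_source", "object_source"]

function propagateSlots(mappingSet) {
  const { mappings = [], ...metadata } = mappingSet
  const inherited = {}
  for (const slot of PROPAGATABLE) {
    if (slot in metadata) inherited[slot] = metadata[slot]
  }
  // Values from the mapping set win over values already present on a mapping
  return { ...metadata, mappings: mappings.map(mapping => ({ ...mapping, ...inherited })) }
}

const result = propagateSlots({
  mapping_set_id: "https://example.org/mappings",
  mapping_provider: "https://example.org/provider",
  mappings: [{ subject_id: "https://example.org/a", object_id: "https://example.org/b" }],
})
console.log(result.mappings[0].mapping_provider) // https://example.org/provider
```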
liberal
Enabling liberal parsing will
- allow empty mappings block in SSSOM/TSV (but still read and validate the metadata block)
- not require mapping set slots (neither mapping_set_id nor license) so the metadata block can be empty
- not require mapping slot mapping_justification
curie
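A CURIE map assigns URI prefixes to short prefix names, so that compact identifiers such as skos:exactMatch can be expanded to full URIs. The sketch below illustrates the idea with a hypothetical helper expandCurie; it is not the package's implementation, and the ex prefix is made up for the example.

```javascript
// Hypothetical helper illustrating CURIE expansion; not part of sssom-js.
function expandCurie(curie, curieMap) {
  const pos = curie.indexOf(":")
  if (pos < 0) return null // no prefix given
  const namespace = curieMap[curie.slice(0, pos)]
  if (typeof namespace !== "string") return null // unknown prefix
  return namespace + curie.slice(pos + 1)
}

const curieMap = {
  skos: "http://www.w3.org/2004/02/skos/core#",
  ex: "https://example.org/", // made-up prefix for this example
}

console.log(expandCurie("skos:exactMatch", curieMap)) // http://www.w3.org/2004/02/skos/core#exactMatch
console.log(expandCurie("unknown:123", curieMap)) // null
```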
To allow all CURIE prefixes from Bioregistry without explicitly defining them in curie_map, download and convert the current list, for instance with the command line tools curl and jq as follows (requires a local copy of file bioregistry.jq), and then reference the resulting file bioregistry.json with option --curie:
curl -sL https://w3id.org/biopragmatics/bioregistry.epm.json | \
jq -Sf bioregistry.jq > bioregistry.json
schemes
JSKOS Concept Schemes to detect when transforming to JSKOS
mappings
Emit mappings only. Metadata is parsed and validated nevertheless.
metadata
Mapping set metadata file in JSON or YAML format for external metadata mode. It is passed as the second argument in the command line client or as a named option in the API. The API also accepts a parsed object.
Validation errors
Validation errors follow the Data Validation Error Format in condensed form. Each error is an object with three fields:
- message: an error message
- value: an optional value that caused the error
- position: an optional object mapping locator types to error locations. The following locator types are used:
  - line: a line number given as string, starting with 1 for the first line
  - linecol: line and column number (both starting with 1), separated by :
  - rfc5147: line span conforming to RFC 5147, for instance line=2,4 for line 3 (!) to 4
  - jsonpointer: JSON Pointer to the malformed YAML or JSON element (for instance /creator_id)
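To make the structure more concrete, here is a hand-written example of such an error object, together with a small helper that turns an rfc5147 line span into 1-based line numbers. The error content and the helper name lineSpan are made up for illustration; the package may report different messages and locators.

```javascript
// Illustrative error object; message and value are invented for this example.
const error = {
  message: "Invalid value for slot creator_id",
  value: "not a URI",
  position: { line: "3", rfc5147: "line=2,4" },
}

// Convert an RFC 5147 line range such as "line=2,4" to 1-based line numbers.
// RFC 5147 counts positions between lines starting at 0, so "line=2,4"
// covers the third and fourth line.
function lineSpan(rfc5147) {
  const match = /^line=(\d+),(\d+)$/.exec(rfc5147)
  if (!match) return null // only the simple "line=start,end" form is handled
  return { first: Number(match[1]) + 1, last: Number(match[2]) }
}

console.log(lineSpan(error.position.rfc5147)) // { first: 3, last: 4 }
```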
Formats
Input format and output format of the mappings can be specified as strings via command line options from and to, and in the web interface. The following formats are supported so far:
format | description | from | to | API
---------|---------------|------|----|-----
tsv | SSSOM/TSV | yes | - | yes
csv | SSSOM/CSV | yes | - | yes
json | SSSOM/JSON/JSON-LD | yes | yes | yes
ndjson | metadata and mappings on individual lines (SSSOM/JSON) | - | yes | -
jskos | JSKOS | - | yes | yes
ndjskos | metadata and mappings on individual lines (JSKOS) | - | yes | -
nq | NQuads (RDF) of raw mappings | - | yes | -
nt | NTriples (RDF) | - | requires jsonld2rdf | -
ttl | RDF/Turtle (RDF) | - | requires jsonld2rdf | -
If not specified, formats are guessed from file name with fallback to tsv (from) and ndjson (to).
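The guessing step could look roughly like the sketch below. It assumes that the last file extension is compared against the list of format names; this is an assumption about the behaviour rather than a description of the package's code, and the function name guessFormat is hypothetical.

```javascript
// Sketch of format guessing by file extension; not the code used by sssom-js.
const INPUT_FORMATS = ["csv", "tsv", "json"]
const OUTPUT_FORMATS = ["json", "ndjson", "jskos", "ndjskos", "nq", "nt", "ttl"]

function guessFormat(filename, formats, fallback) {
  // Take the part after the last dot, e.g. "tsv" for "mappings.sssom.tsv"
  const extension = String(filename ?? "").split(".").pop().toLowerCase()
  return formats.includes(extension) ? extension : fallback
}

console.log(guessFormat("mappings.sssom.tsv", INPUT_FORMATS, "tsv")) // tsv
console.log(guessFormat("mappings.csv", INPUT_FORMATS, "tsv")) // csv
console.log(guessFormat("result.ttl", OUTPUT_FORMATS, "ndjson")) // ttl
console.log(guessFormat("-", OUTPUT_FORMATS, "ndjson")) // ndjson
```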
Formats json, jskos, nt, and ttl require loading the full input into memory for processing; the other formats support streaming.
The NQuads format (nq) is limited to raw mapping statements: it omits metadata and all slots except subject_id, predicate_id, object_id, and the optional mapping_set_id. Combine with option -m, --mappings to omit the latter, resulting in NTriples format of raw mappings.
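As an illustration of such raw mapping statements, the following sketch builds one line per mapping from the slots named above. It is a simplified stand-in for the package's serializer (the function name toStatement is hypothetical) and assumes that all values are absolute URIs without characters that need escaping.

```javascript
// Sketch of one raw mapping statement per mapping; not the code used by sssom-js.
function toStatement(mapping, mappingSetId) {
  const terms = [mapping.subject_id, mapping.predicate_id, mapping.object_id]
  if (mappingSetId) terms.push(mappingSetId) // fourth term, left out for NTriples
  return terms.map(uri => `<${uri}>`).join(" ") + " ."
}

const mapping = {
  subject_id: "https://example.org/a",
  predicate_id: "http://www.w3.org/2004/02/skos/core#exactMatch",
  object_id: "https://example.org/b",
}

console.log(toStatement(mapping, "https://example.org/mappings"))
// <https://example.org/a> <http://www.w3.org/2004/02/skos/core#exactMatch> <https://example.org/b> <https://example.org/mappings> .
console.log(toStatement(mapping))
// <https://example.org/a> <http://www.w3.org/2004/02/skos/core#exactMatch> <https://example.org/b> .
```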
RDF
RDF serialisation of SSSOM has not been fully specified yet. This package uses a JSON-LD context to transform SSSOM/JSON to SSSOM/RDF, except for the NQuads (nq) output format, which consists of only one statement per mapping.
The following slots are not included because semantics are not clear (yet) or because their content better belongs to another place:
- other
- mapping_tool, mapping_tool_id and mapping_tool_version
- subject_label, subject_source, subject_source_version, subject_type, object_source, object_source_version, object_type, predicate_label
- predicate_type, object_category
- object_match_field, object_preprocessing, subject_match_field, subject_preprocessing, similarity_measure, match_string, similarity_score - see this discussion
- extension_definitions
- predicate_modifier
- mapping_cardinality
The following slots will likely be included once a good existing predicate URI has been found:
JSKOS
The JSKOS data format is used in terminology applications for controlled vocabularies and their mappings.
The following correspondence between SSSOM and JSKOS has not been fully implemented yet. Some JSKOS fields are only available since version 0.7.0 of the JSKOS specification.
Common slots
SSSOM slot | JSKOS field
------------|------------
comment | note.und[]
creator_id | contributor[].uri
creator_label | contributor[].prefLabel.und
publication_date | published
see_also | ?
other | -
Propagatable slots
SSSOM slot | JSKOS field
-----------|------------
mapping_date | created
mapping_provider | publisher[].url
mapping_tool | tool[].url* (0.7.0)
mapping_tool_id | tool[].uri* (0.7.0)
mapping_tool_version | tool[].version* (0.7.0)
object_source | to.memberSet[].inScheme[].uri
object_source_version | to.memberSet[].inScheme[].version (0.7.0)
object_type | to.memberSet[].type (URI, limited list)
subject_source | from.memberSet[].inScheme[].uri
subject_source_version | from.memberSet[].inScheme[].version (0.7.0)
subject_type | from.memberSet[].type (URI, limited list)
predicate_type | -
object_match_field | - (see this discussion)
object_preprocessing | - (ditto)
subject_match_field | - (ditto)
subject_preprocessing | - (ditto)
similarity_measure | - (ditto)
* The correspondence of slots mapping_tool, mapping_tool_id, and mapping_tool_version is slightly more complicated.
Mapping set slots
SSSOM slot | JSKOS field
-----------|------------
curie_map | -
license | license.uri
mappings | mappings (of a registry or concordance)
mapping_set_id | uri
mapping_set_version | version (0.7.0)
mapping_set_source | source
mapping_set_title | prefLabel.und
mapping_set_description | definition
issue_tracker | issueTracker (0.7.0)
predicate_label | -
extension_definitions | -
Mapping slots
SSSOM slot | JSKOS field
------------|------------
mapping_id | uri
subject_id | from.memberSet[].uri
subject_label | from.memberSet[].prefLabel
subject_category | -
predicate_id | type
predicate_label | - (implied by type)
object_id | to.memberSet[].uri
object_label | to.memberSet[].prefLabel
object_category | -
mapping_justification | justification (0.7.0)
author_id | creator[].uri
author_label | creator[].prefLabel
reviewer_id | annotations[].creator.id
reviewer_label | annotations[].creator.name
mapping_source | source
confidence | mappingRelevance
curation_rule | guidelines (0.7.0)
curation_rule_text | guidelines[].prefLabel (0.7.0)
issue_tracker_item | issue (0.7.0)
license | - (only for mapping sets)
predicate_modifier | -
mapping_cardinality | -
match_string | - (see this discussion)
similarity_score | - (ditto)
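The core of this table can be sketched in a few lines. The function below is a simplified illustration and not the package's toJskosMapping: the name sketchJskosMapping is hypothetical, it covers only a handful of slots, and it assumes that labels are stored as language maps with the language code und, as in the table of common slots.

```javascript
// Simplified illustration of the slot correspondence; not the code used by sssom-js.
function sketchJskosMapping(mapping) {
  // Build a JSKOS concept bundle with a single member
  const bundle = (uri, label) => {
    const concept = { uri }
    if (label) concept.prefLabel = { und: label } // assumed language code
    return { memberSet: [concept] }
  }
  const jskos = {
    from: bundle(mapping.subject_id, mapping.subject_label),
    to: bundle(mapping.object_id, mapping.object_label),
    type: [mapping.predicate_id], // JSKOS expects a list of URIs here
  }
  if (mapping.mapping_id) jskos.uri = mapping.mapping_id
  if (mapping.mapping_justification) jskos.justification = mapping.mapping_justification
  if ("confidence" in mapping) jskos.mappingRelevance = mapping.confidence
  return jskos
}

const example = sketchJskosMapping({
  subject_id: "https://example.org/a",
  subject_label: "A",
  predicate_id: "http://www.w3.org/2004/02/skos/core#exactMatch",
  object_id: "https://example.org/b",
  mapping_justification: "https://w3id.org/semapv/vocab/ManualMappingCuration",
  confidence: 0.9,
})
console.log(JSON.stringify(example.from))
// {"memberSet":[{"uri":"https://example.org/a","prefLabel":{"und":"A"}}]}
```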
Limitations
This library follows the SSSOM specification as closely as possible, but it does not aim to be a fully compliant implementation. The latter would also require compliance with LinkML, a specification much more complex than needed for SSSOM and not yet fully implemented in JavaScript. In particular:
- All slots of type Uri must be absolute URIs as defined in RFC 3986
- Literal Mappings are not supported
- Non-standard slots are not supported:
  - mapping set slot extension_definitions is ignored
  - mapping set slot other is read and validated but not used
- SSSOM/JSON, the JSON serialization of SSSOM, has not been specified yet, so it may differ from the JSON(-LD) format used in this library. The same applies to the RDF serialization.
- Propagation silently overwrites existing mapping slots instead of raising an error
- Uniqueness of mapping slot mapping_id is not checked.
Survey
The survey directory contains a survey of published SSSOM data with validation results. See the dev branch for the most recent update.
Maintainers
- @nichtich (Jakob Voß)
Contribute
Contributions are welcome! Please use the issue tracker for questions, bug reports, and feature requests.
License
MIT license
