member-extraction-algorithm
v0.0.1
Member Extraction Algorithm to get all quads related to an entity in an RDF graph using named graphs.
Member Extraction Algorithm
member-extraction-algorithm is a TypeScript library that lets you keep many members' RDF statements together in one
RDF resource (represented by an RdfStore; see rdf-stores), and later
extract each member's original statements again.
It provides a deterministic round-trip between:
- createStore, which serializes per-member quads into a member-centric named-graph layout; and
- extract, which reconstructs the per-member quads from that layout.
Core Idea
The approach uses the graph position of each stored quad to represent which member the statement belongs to.
Because that graph position is already used for member identity, it cannot simultaneously hold the original graph of a non-default graph input quad. To preserve that original graph, the library stores metadata using RDF 1.2 triple terms:
- rdf:reifies points to the original (subject, predicate, object) triple term;
- mea:graph stores the original graph IRI (or mea:DefaultGraph for triples).
extract resolves that metadata back into the original triple/quad form.
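As a rough illustration of this layout, the sketch below encodes one input quad into the member graph. It is a toy model and not the library's implementation: terms are plain strings, the empty string stands for the default graph, and the names encodeQuad and tripleTerm are hypothetical. It also assumes that the mea: prefix expands to https://w3id.org/mea#, that the caller supplies the reifier term, and that default-graph triples are stored without metadata.

```typescript
// Toy model (assumption): terms are strings and '' is the default graph.
type Quad = { s: string; p: string; o: string; g: string };

const RDF_REIFIES = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#reifies';
const MEA_GRAPH = 'https://w3id.org/mea#graph'; // assumed expansion of mea:graph

// Hypothetical helper: writes a triple term as a single string.
const tripleTerm = (s: string, p: string, o: string): string =>
  `<<( ${s} ${p} ${o} )>>`;

// Hypothetical helper: encodes one input quad for the given member.
function encodeQuad(quad: Quad, member: string, reifier: string): Quad[] {
  // The graph position now carries the member identity.
  const data: Quad = { s: quad.s, p: quad.p, o: quad.o, g: member };
  // Assumption: default-graph triples are stored without extra metadata.
  if (quad.g === '') return [data];
  return [
    data,
    // The reifier points at the original triple ...
    { s: reifier, p: RDF_REIFIES, o: tripleTerm(quad.s, quad.p, quad.o), g: member },
    // ... and records the graph the triple originally lived in.
    { s: reifier, p: MEA_GRAPH, o: quad.g, g: member },
  ];
}

const encoded = encodeQuad(
  {
    s: 'http://example.org/member/1',
    p: 'http://schema.org/name',
    o: '"Alice"',
    g: 'http://example.org/graph/a',
  },
  'http://example.org/member/1',
  '_:r1',
);
```

Here encoded holds the data quad plus two metadata quads, all placed in the member's graph.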
Installation
npm install member-extraction-algorithm
What The Library Does
The package exposes two core functions:
- extract(store, members?, memberProperty?, root?): extracts the quads that belong to each member from an RDF store.
- createStore(membersQuads, members?, memberProperty?, root?): creates an RDF store in the encoding expected by extract, including metadata for non-default graph quads.
Together, these functions make it possible to deterministically:
- serialize multiple members' statements into one store (RDF resource);
- retain enough information to restore original non-default graph terms; and
- recover the quads per member in a stable member-by-member structure.
Inputs And Data Model
Required data structure
The algorithm expects an RDF store implementing rdf-stores semantics.
For non-default graph round-tripping, the store and surrounding tooling must support RDF 1.2 triple terms used by
rdf:reifies.
extract input parameters
- store: RdfStore (required)
  - dataset to extract members from.
- members?: Term[] (optional)
  - explicit member identifiers.
  - if omitted, members are discovered from root memberProperty ?member in the default graph.
- memberProperty: Quad_Predicate (optional, default https://w3id.org/mea#member)
  - predicate used to link the root resource to member identifiers.
- root?: Quad_Subject (optional)
  - subject used for membership statements.
  - if omitted, membership lookup is performed with an unspecified subject (null) in the default graph.
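The discovery rule for omitted members can be sketched as a simple pattern match. This is a simplified model rather than the library's code: quads are plain string records, the empty string stands for the default graph, and discoverMembers is a hypothetical name. Whether the library deduplicates or orders discovered members is not stated here, so the sketch does neither.

```typescript
// Toy model (assumption): terms are strings and '' is the default graph.
type Quad = { s: string; p: string; o: string; g: string };

const DEFAULT_MEMBER_PROPERTY = 'https://w3id.org/mea#member';

// Hypothetical helper: finds ?member in `root memberProperty ?member`
// within the default graph. A null root acts as a wildcard subject.
function discoverMembers(
  quads: Quad[],
  memberProperty: string = DEFAULT_MEMBER_PROPERTY,
  root: string | null = null,
): string[] {
  return quads
    .filter((q) => q.g === '')                       // default graph only
    .filter((q) => q.p === memberProperty)           // membership predicate
    .filter((q) => root === null || q.s === root)    // optional root subject
    .map((q) => q.o);
}

const dataset: Quad[] = [
  { s: 'http://example.org/a', p: DEFAULT_MEMBER_PROPERTY, o: 'http://example.org/member/1', g: '' },
  { s: 'http://example.org/b', p: DEFAULT_MEMBER_PROPERTY, o: 'http://example.org/member/2', g: '' },
  // Ignored: this membership statement is not in the default graph.
  { s: 'http://example.org/a', p: DEFAULT_MEMBER_PROPERTY, o: 'http://example.org/member/3', g: 'http://example.org/g' },
];

const allMembers = discoverMembers(dataset);
const membersOfA = discoverMembers(dataset, DEFAULT_MEMBER_PROPERTY, 'http://example.org/a');
```

With no root, both default-graph membership statements match; with root http://example.org/a, only the first does.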
createStore input parameters
- membersQuads: Quad[][] (required)
  - one array of quads per member.
- members?: (NamedNode | BlankNode | Variable)[] (optional)
  - explicit member identifiers.
  - if omitted, urn:uuid:* identifiers are generated.
- memberProperty: Quad_Predicate (optional, default https://w3id.org/mea#member)
- root: Quad_Subject (optional, default to the relative base IRI '')
Quick Usage
import {DataFactory} from 'rdf-data-factory';
import {RdfStore} from 'rdf-stores';
import {extract, createStore} from 'member-extraction-algorithm';
const df = new DataFactory();
const store = RdfStore.createDefault();
const root = df.namedNode('http://example.org/collection');
const member = df.namedNode('http://example.org/member/1');
const memberProperty = df.namedNode('https://w3id.org/mea#member');
// Membership declaration in default graph
store.addQuad(df.quad(root, memberProperty, member));
// Member content is stored in the member named graph
store.addQuad(df.quad(
df.namedNode('http://example.org/member/1'),
df.namedNode('http://schema.org/name'),
df.literal('Alice'),
member,
));
const extracted = extract(store, [member], memberProperty, root);
console.log(extracted[0]); // quads for member/1
// Optional: construct a compatible store from per-member quads
const reconstructed = createStore(extracted, [member], memberProperty, root);
In-Depth: How The Extraction Algorithm Works
For each target member, the algorithm performs the following steps.
1. Collect member graph statements
   - all quads in graph member are retrieved.
2. Identify possible metadata statements
   - statements with predicate rdf:reifies or mea:graph are not immediately returned.
   - they are temporarily tracked as "possible metadata" to avoid leaking serialization internals.
3. Treat each non-metadata statement as a candidate triple
   - from (s, p, o, member), the algorithm creates the triple (s, p, o).
4. Look for flattening metadata linked to that triple
   - find reifier terms with (reifier, rdf:reifies, (s,p,o), member).
   - for each reifier, read (reifier, mea:graph, g, member).
5. Reconstruct original output shape
   - if no mea:graph is found: output (s, p, o) as default-graph triple.
   - if mea:graph = mea:DefaultGraph: output (s, p, o) as default-graph triple.
   - otherwise: output (s, p, o, g) as quad in graph g.
6. Resolve ambiguous metadata
   - metadata-like statements that were never matched to a candidate triple are emitted as regular data triples.
   - this preserves correctness when users have real data that happens to use rdf:reifies or mea:graph.
In combination with createStore, this gives a deterministic workflow:
- start from
Quad[][]grouped per member; - store everything in one
RdfStorewhile using graph = member; - later extract and recover the quads belonging to each member, including reconstructed non-default graph quads.
Why this design
The algorithm prioritizes lossless recovery of original member data while remaining robust to mixed datasets that contain both:
- serialization metadata; and
- domain-level statements with overlapping predicates.
This behavior is useful when data passes through heterogeneous pipelines and transformations.
API Surface
extract(
  store: RdfStore,
  members?: Term[],
  memberProperty?: Quad_Predicate,
  root?: Quad_Subject,
): Quad[][]
createStore(
  membersQuads: Quad[][],
  members?: (NamedNode | BlankNode | Variable)[],
  memberProperty?: Quad_Predicate,
  root?: Quad_Subject,
): RdfStore
Notes And Limitations
- extraction returns one Quad[] per member in input order;
- if members is omitted, discovery depends on membership statements in the default graph;
- graph reconstruction relies on the rdf:reifies + mea:graph convention and RDF 1.2 triple-term support;
- current README examples are minimal and intended as a starting point for experiments.
Citation
If you use this library in scholarly work, consider citing the associated publication (to be added).
License
This software is written by Ieben Smessaert.
This code is released under the MIT License.
