
@solid/content-moderation-sightengine

v2.0.0

Content moderation plugin for Community Solid Server using SightEngine API

CSS Moderation Plugin

A Community Solid Server (CSS) plugin that provides content moderation for uploads using the SightEngine API. Supports image, text, and video moderation with configurable thresholds.

Features

  • Image Moderation: Detects nudity, violence/gore, weapons, alcohol, drugs, offensive symbols, self-harm, and gambling
  • Text Moderation: Detects sexual, discriminatory, insulting, violent, toxic, and self-harm content, plus personal info (PII)
  • Video Moderation: Frame-by-frame analysis for all image moderation categories plus tobacco
  • Configurable Thresholds: Set custom thresholds via Components.js config or environment variables
  • Audit Logging: JSON Lines audit log with pod name and agent (WebID) tracking
  • Fail-Open Policy: API errors allow uploads to proceed (configurable)
  • MIME Type Bypass Protection: Magic byte detection, extension validation, and unknown type handling

Project Structure

cms-solid/
├── src/
│   ├── ModerationOperationHandler.ts   # Main handler class with SightEngine integration
│   └── index.ts                        # Export barrel
├── components/
│   └── components.jsonld               # Component definition with configurable parameters
├── dist/                               # Compiled JavaScript
├── package.json
└── tsconfig.json

How It Works

The plugin wraps CSS's default OperationHandler with a ModerationOperationHandler that intercepts PUT/POST operations and moderates content before storage.

Request Flow:
┌─────────────┐    ┌──────────────────┐    ┌───────────────────────────┐    ┌──────────────────┐
│ HTTP Request │ → │ AuthorizingHandler│ → │ ModerationOperationHandler │ → │ WaterfallHandler │
└─────────────┘    └──────────────────┘    └───────────────────────────┘    └──────────────────┘
                                                      ↓
                                           [SightEngine API Moderation]
                                                      ↓
                                           [Audit Log Entry Written]
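The wrapping pattern above can be sketched with stand-in types. These are simplified local interfaces, not the real CSS `OperationHandler` signature; the sketch only illustrates the reject-before-store delegation:

```typescript
// Simplified stand-ins for the CSS types; the real interfaces carry more state.
interface Operation { method: string; body: string }
interface Handler { handle(op: Operation): Promise<string> }

class ModerationWrapper implements Handler {
  public constructor(
    private readonly source: Handler,
    // Stand-in for the SightEngine call; resolves false when content violates policy.
    private readonly isAllowed: (op: Operation) => Promise<boolean>,
  ) {}

  public async handle(op: Operation): Promise<string> {
    // Only mutating operations are moderated; reads pass straight through.
    if ((op.method === 'PUT' || op.method === 'POST') && !(await this.isAllowed(op))) {
      throw new Error('Content rejected due to policy violations');
    }
    // Delegate to the wrapped handler, which performs the actual storage.
    return this.source.handle(op);
  }
}
```

GET requests never reach the moderation check, which matches the pass-through behaviour exercised by the unit tests later in this README.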

Installation

  1. Build the plugin:

    cd cms-solid
    npm install
    npm run build
  2. Link to CSS:

    npm link
    cd /path/to/CommunitySolidServer
    npm link @solid/content-moderation-sightengine
  3. Set up SightEngine API credentials:

    export SIGHTENGINE_API_USER="your_api_user"
    export SIGHTENGINE_API_SECRET="your_api_secret"
  4. Create the moderation config in CSS:

    Create config/moderation.json in your CSS directory (see Configuration section below).

  5. Start CSS with the moderation config:

    cd /path/to/CommunitySolidServer
    SIGHTENGINE_API_USER="..." SIGHTENGINE_API_SECRET="..." \
      node bin/server.js -c config/moderation.json -f data/ -p 3009

Configuration

Minimal Configuration (config/moderation.json)

This is a minimal configuration that uses default thresholds. For a full production configuration with all options, see the Full Configuration Example below.

{
  "@context": [
    "https://linkedsoftwaredependencies.org/bundles/npm/componentsjs/^5.0.0/components/context.jsonld",
    "https://linkedsoftwaredependencies.org/bundles/npm/@solid/community-server/^7.0.0/components/context.jsonld",
    {
      "scms": "https://linkedsoftwaredependencies.org/bundles/npm/@solid/content-moderation-sightengine/"
    }
  ],
  "import": [
    "css:config/app/main/default.json",
    "css:config/app/init/default.json",
    "css:config/app/variables/default.json",
    "css:config/http/handler/default.json",
    "css:config/http/middleware/default.json",
    "css:config/http/notifications/all.json",
    "css:config/http/server-factory/http.json",
    "css:config/http/static/default.json",
    "css:config/identity/access/public.json",
    "css:config/identity/email/default.json",
    "css:config/identity/handler/default.json",
    "css:config/identity/oidc/default.json",
    "css:config/identity/ownership/token.json",
    "css:config/identity/pod/static.json",
    "css:config/ldp/authentication/dpop-bearer.json",
    "css:config/ldp/authorization/webacl.json",
    "css:config/ldp/handler/components/authorizer.json",
    "css:config/ldp/handler/components/error-handler.json",
    "css:config/ldp/handler/components/operation-handler.json",
    "css:config/ldp/handler/components/operation-metadata.json",
    "css:config/ldp/handler/components/preferences.json",
    "css:config/ldp/handler/components/request-parser.json",
    "css:config/ldp/handler/components/response-writer.json",
    "css:config/ldp/metadata-parser/default.json",
    "css:config/ldp/metadata-writer/default.json",
    "css:config/ldp/modes/default.json",
    "css:config/storage/backend/file.json",
    "css:config/storage/key-value/resource-store.json",
    "css:config/storage/location/pod.json",
    "css:config/storage/middleware/default.json",
    "css:config/util/auxiliary/acl.json",
    "css:config/util/identifiers/suffix.json",
    "css:config/util/index/default.json",
    "css:config/util/logging/winston.json",
    "css:config/util/representation-conversion/default.json",
    "css:config/util/resource-locker/file.json",
    "css:config/util/variables/default.json"
  ],
  "@graph": [
    {
      "comment": "The main entry point into the main Solid behaviour - wired with moderation",
      "@id": "urn:solid-server:default:LdpHandler",
      "@type": "ParsingHttpHandler",
      "args_requestParser": { "@id": "urn:solid-server:default:RequestParser" },
      "args_errorHandler": { "@id": "urn:solid-server:default:ErrorHandler" },
      "args_responseWriter": { "@id": "urn:solid-server:default:ResponseWriter" },
      "args_operationHandler": {
        "@type": "AuthorizingHttpHandler",
        "args_credentialsExtractor": { "@id": "urn:solid-server:default:CredentialsExtractor" },
        "args_modesExtractor": { "@id": "urn:solid-server:default:ModesExtractor" },
        "args_permissionReader": { "@id": "urn:solid-server:default:PermissionReader" },
        "args_authorizer": { "@id": "urn:solid-server:default:Authorizer" },
        "args_operationHandler": {
          "@type": "WacAllowHttpHandler",
          "args_credentialsExtractor": { "@id": "urn:solid-server:default:CredentialsExtractor" },
          "args_modesExtractor": { "@id": "urn:solid-server:default:ModesExtractor" },
          "args_permissionReader": { "@id": "urn:solid-server:default:PermissionReader" },
          "args_operationHandler": { "@id": "urn:solid-server:default:ModerationOperationHandler" }
        }
      }
    },
    {
      "comment": "Moderation handler with default thresholds",
      "@id": "urn:solid-server:default:ModerationOperationHandler",
      "@type": "scms:ModerationOperationHandler",
      "scms:ModerationOperationHandler#source": {
        "@id": "urn:solid-server:default:OperationHandler"
      }
    }
  ]
}

Configurable Parameters

All thresholds are values between 0 and 1; higher values are more permissive (more content is allowed through).

| Parameter | Default | Description |
|-----------|---------|-------------|
| nudityThreshold | 0.5 | Threshold for nudity detection |
| violenceThreshold | 0.5 | Threshold for violence/gore detection |
| weaponThreshold | 0.5 | Threshold for weapon detection |
| alcoholThreshold | 0.8 | Threshold for alcohol detection |
| drugsThreshold | 0.5 | Threshold for drugs detection |
| offensiveThreshold | 0.5 | Threshold for offensive symbols |
| selfharmThreshold | 0.3 | Threshold for self-harm detection |
| gamblingThreshold | 0.5 | Threshold for gambling detection |
| tobaccoThreshold | 0.5 | Threshold for tobacco detection (video) |
| textSexualThreshold | 0.5 | Threshold for sexual text content |
| textDiscriminatoryThreshold | 0.5 | Threshold for discriminatory text |
| textInsultingThreshold | 0.5 | Threshold for insulting text |
| textViolentThreshold | 0.5 | Threshold for violent text |
| textToxicThreshold | 0.5 | Threshold for toxic text |
| textSelfharmThreshold | 0.3 | Threshold for self-harm text |
| enabledChecks | nudity,gore,wad,offensive | Comma-separated image checks |
| enabledTextChecks | sexual,discriminatory,insulting,violent,toxic,self-harm,personal | Comma-separated text checks |
| enabledVideoChecks | nudity,gore,wad,offensive,self-harm,gambling,tobacco | Comma-separated video checks |
| auditLogEnabled | true | Enable/disable audit logging |
| auditLogPath | ./moderation-audit.log | Path to audit log file |

Environment Variables

Environment variables can be used as fallbacks when Components.js parameters are not set:

# API Credentials (required)
SIGHTENGINE_API_USER=your_user_id
SIGHTENGINE_API_SECRET=your_secret

# Thresholds (optional, Components.js config takes priority)
MODERATION_THRESHOLD_NUDITY=0.5
MODERATION_THRESHOLD_VIOLENCE=0.5
MODERATION_THRESHOLD_WEAPON=0.5
MODERATION_THRESHOLD_ALCOHOL=0.8
MODERATION_THRESHOLD_DRUGS=0.5
MODERATION_THRESHOLD_OFFENSIVE=0.5
MODERATION_THRESHOLD_SELFHARM=0.3
MODERATION_THRESHOLD_GAMBLING=0.5
MODERATION_THRESHOLD_TOBACCO=0.5

# Text thresholds
MODERATION_THRESHOLD_TEXT_SEXUAL=0.5
MODERATION_THRESHOLD_TEXT_DISCRIMINATORY=0.5
MODERATION_THRESHOLD_TEXT_INSULTING=0.5
MODERATION_THRESHOLD_TEXT_VIOLENT=0.5
MODERATION_THRESHOLD_TEXT_TOXIC=0.5
MODERATION_THRESHOLD_TEXT_SELFHARM=0.3

# Enabled checks
MODERATION_CHECKS=nudity,gore,wad,offensive
MODERATION_TEXT_CHECKS=sexual,discriminatory,insulting,violent,toxic,self-harm,personal
MODERATION_VIDEO_CHECKS=nudity,gore,wad,offensive,self-harm,gambling,tobacco

# Security options (MIME type bypass protection)
MODERATION_REJECT_UNKNOWN_TYPES=true    # Reject unknown/unrecognized MIME types
MODERATION_VALIDATE_EXTENSIONS=true     # Validate file extension matches Content-Type
MODERATION_MODERATE_UNKNOWN_TYPES=true  # Detect actual type via magic bytes and moderate

# Audit logging
MODERATION_AUDIT_LOG=true
MODERATION_AUDIT_LOG_PATH=./moderation-audit.log
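The precedence between Components.js parameters and these environment variables can be illustrated with a small helper. This is illustrative only; the plugin's internal resolution logic may differ:

```typescript
// Resolve one threshold: an explicit Components.js value wins, then the
// environment variable, then the built-in default. `envValue` stands in for
// process.env['MODERATION_THRESHOLD_...'].
function resolveThreshold(
  configured: number | undefined,
  envValue: string | undefined,
  fallback: number,
): number {
  if (configured !== undefined) {
    return configured;
  }
  if (envValue !== undefined && !Number.isNaN(Number(envValue))) {
    return Number(envValue);
  }
  return fallback;
}
```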

Security: MIME Type Bypass Protection

The plugin includes defenses against attackers trying to bypass moderation by using fake Content-Type headers (e.g., Content-Type: dont/moderate+jpeg).

Security Options

| Option | Config Key | Env Var | Default | Description |
|--------|------------|---------|---------|-------------|
| Magic Byte Detection | N/A | N/A | Always on | Verifies file magic bytes match claimed Content-Type |
| Reject Unknown Types | rejectUnknownTypes | MODERATION_REJECT_UNKNOWN_TYPES | false | Reject uploads with unrecognized MIME types |
| Extension Validation | validateExtensions | MODERATION_VALIDATE_EXTENSIONS | false | Reject if file extension doesn't match Content-Type |
| Moderate Unknown Types | moderateUnknownTypes | MODERATION_MODERATE_UNKNOWN_TYPES | false | Detect actual type via magic bytes and moderate anyway |
| Moderate RDF as Text | moderateRdfAsText | MODERATION_MODERATE_RDF_AS_TEXT | false | Extract and moderate text content from RDF/Linked Data |

Recommended Security Configuration

For maximum security, enable all options:

{
  "@id": "urn:solid-server:default:ModerationOperationHandler",
  "@type": "scms:ModerationOperationHandler",
  "source": { "@id": "urn:solid-server:default:OperationHandler" },
  "rejectUnknownTypes": true,
  "validateExtensions": true,
  "moderateUnknownTypes": true,
  "moderateRdfAsText": true
}

How the Security Checks Work

  1. Magic Byte Detection (always active): When moderating images/videos, the handler verifies the first bytes match the expected format (e.g., JPEG starts with FFD8FF)

  2. Reject Unknown Types: If enabled, any Content-Type not in the allowed list (images, video, text, RDF, Solid formats) is rejected with 403 Forbidden

  3. Extension Validation: If enabled, cross-checks the file extension against Content-Type (e.g., .jpg must have image/jpeg)

  4. Moderate Unknown Types: If enabled and the Content-Type is unrecognized, the handler uses magic bytes to detect the actual type and moderates it appropriately

  5. Moderate RDF as Text: If enabled, extracts text content (string literals, labels, comments) from RDF/Linked Data formats and sends it to the text moderation API
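Magic-byte verification amounts to comparing the first bytes of the upload against known signatures. A sketch with a few common signatures (the plugin's actual signature table is larger):

```typescript
// A few well-known file signatures; illustrative subset only.
const SIGNATURES: Record<string, number[]> = {
  'image/jpeg': [0xff, 0xd8, 0xff],
  'image/png': [0x89, 0x50, 0x4e, 0x47],
  'image/gif': [0x47, 0x49, 0x46, 0x38],
};

// True when the upload's leading bytes match the claimed Content-Type.
function matchesClaimedType(bytes: Uint8Array, contentType: string): boolean {
  const signature = SIGNATURES[contentType];
  if (!signature) {
    // Unknown type: the reject-unknown / moderate-unknown options decide what happens next.
    return false;
  }
  return signature.every((byte, i) => bytes[i] === byte);
}
```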

RDF/Linked Data Text Moderation

When moderateRdfAsText is enabled, the plugin extracts text content from the following formats:

| Format | MIME Type | Extraction Method |
|--------|-----------|-------------------|
| Turtle | text/turtle | String literals ("text", 'text') |
| JSON-LD | application/ld+json | All string values (excluding URLs) |
| N-Triples | application/n-triples | String literals |
| N-Quads | application/n-quads | String literals |
| RDF/XML | application/rdf+xml | Text content between tags |
| SPARQL Query | application/sparql-query | String literals and comments |
| SPARQL Update | application/sparql-update | String literals and comments |
| SPARQL Results JSON | application/sparql-results+json | Literal binding values |
| SPARQL Results XML | application/sparql-results+xml | Literal element content |

This catches scenarios like:

  • Turtle files containing hate speech in literal values
  • JSON-LD documents with offensive text in descriptions
  • SPARQL queries with inappropriate content in comments
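Literal extraction for the Turtle case can be approximated with a regex over quoted strings. This is a simplification for illustration; a robust implementation would use an RDF parser, and the plugin's own method may differ:

```typescript
// Collect the contents of double- and single-quoted literals from Turtle-like text.
function extractTurtleLiterals(turtle: string): string[] {
  const literals: string[] = [];
  // Match "..." or '...' while allowing backslash-escaped quotes inside.
  const quoted = /"((?:[^"\\]|\\.)*)"|'((?:[^'\\]|\\.)*)'/g;
  let match: RegExpExecArray | null;
  while ((match = quoted.exec(turtle)) !== null) {
    literals.push(match[1] ?? match[2] ?? '');
  }
  return literals;
}
```

The collected literals would then be submitted to the text-moderation endpoint like any other text upload.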

Audit Logging

The plugin writes audit log entries in JSON Lines format:

{"timestamp":"2026-01-26T20:27:00.834Z","action":"REJECT","contentType":"image","path":"http://localhost:3009/alice/photos/image.jpg","pod":"alice","agent":"https://alice.example.org/profile/card#me","mimeType":"image/jpeg","reason":"Content rejected due to policy violations: violence/gore (score: 0.65)","scores":{"gore":0.65},"requestId":"req_abc123"}
{"timestamp":"2026-01-26T20:28:15.123Z","action":"ALLOW","contentType":"image","path":"http://localhost:3009/bob/documents/photo.png","pod":"bob","mimeType":"image/png","scores":{"nudity":0.12,"gore":0.05}}
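Producing one of these records is just serializing a flat object and terminating it with a newline. A hypothetical formatter (field names mirror the entries above; the plugin's own serializer may differ):

```typescript
// Illustrative shape of one audit record; optional fields are omitted when absent.
interface AuditEntry {
  action: 'ALLOW' | 'REJECT' | 'ERROR';
  contentType: 'image' | 'text' | 'video';
  path: string;
  pod?: string;
  agent?: string;
  mimeType?: string;
  reason?: string;
  scores?: Record<string, number>;
}

// One complete JSON object per line; the trailing newline terminates the record,
// which is what lets jq stream the log file line by line.
function formatAuditLine(entry: AuditEntry): string {
  return `${JSON.stringify({ timestamp: new Date().toISOString(), ...entry })}\n`;
}
```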

Audit Log Fields

| Field | Description |
|-------|-------------|
| timestamp | ISO 8601 timestamp |
| action | ALLOW, REJECT, or ERROR |
| contentType | image, text, or video |
| path | Full resource URL |
| pod | Pod name extracted from path |
| agent | WebID of the authenticated user (if available) |
| mimeType | Content-Type of the upload |
| reason | Rejection reason (for REJECT actions) |
| scores | Object with moderation scores |
| requestId | SightEngine request ID |

Querying the Audit Log

# View all rejections
cat moderation-audit.log | jq 'select(.action == "REJECT")'

# View rejections for a specific pod
cat moderation-audit.log | jq 'select(.action == "REJECT" and .pod == "alice")'

# Count by action type
cat moderation-audit.log | jq -s 'group_by(.action) | map({action: .[0].action, count: length})'

Supported Content Types

Images

  • image/jpeg, image/jpg, image/png, image/gif, image/webp, image/bmp

Text

  • text/plain, text/html, text/markdown, text/csv
  • application/json, application/xml, text/xml

Video

  • video/mp4, video/mpeg, video/quicktime, video/x-msvideo
  • video/x-ms-wmv, video/webm, video/ogg, video/3gpp, video/3gpp2

Testing

Manual Testing

# Test image upload
curl -X PUT -H "Content-Type: image/jpeg" \
  --data-binary @test-image.jpg \
  http://localhost:3009/test-image.jpg

# Test text upload
curl -X PUT -H "Content-Type: text/plain" \
  -d "Hello world, this is a test message" \
  http://localhost:3009/test.txt

# Check server logs
tail -f /path/to/css/server.log | grep Moderation

# Check audit log
tail -f moderation-audit.log | jq .

Automated Testing

This project includes automated unit and integration tests using Jest and ts-jest:

# Run all tests (unit + integration)
npm test

# Run only unit tests
npm run test:unit

# Run only integration tests
npm run test:integration

# Run tests with coverage report
npm run test:coverage

# Run tests in watch mode during development
npm run test:watch

Test Structure

test/
├── config/             # Test-specific CSS configurations
│   └── moderation-memory.json
├── fixtures/           # Mock SightEngine API responses
│   └── mock-responses.ts
├── integration/        # Integration tests (real CSS server)
│   └── Server.test.ts
└── unit/               # Unit tests (mocked dependencies)
    └── ModerationOperationHandler.test.ts

Test Categories

Unit Tests (18 tests) - Mock external dependencies:

  1. Request delegation - verifies canHandle() delegates to source handler
  2. GET request passthrough - ensures read operations bypass moderation
  3. Image moderation - tests for nudity, violence, weapons detection with threshold checks
  4. Text moderation - tests for toxic, sexual, discriminatory content detection
  5. API error handling - verifies fail-open policy on API errors
  6. API credentials - confirms moderation is skipped without credentials
  7. Threshold configuration - validates custom threshold settings work correctly
  8. Non-moderated content - ensures unsupported content types pass through

Integration Tests (8 tests) - Real CSS server with in-memory storage:

  1. Server startup - verifies server responds correctly
  2. Text uploads - tests PUT and GET for text content
  3. Resource operations - tests DELETE and HEAD requests
  4. Content type handling - tests JSON and HTML content types

Writing New Tests

Unit tests mock the global fetch function to simulate SightEngine API responses:

global.fetch = jest.fn(() => Promise.resolve({
  ok: true,
  json: () => Promise.resolve(mockSafeImageResponse),
})) as jest.Mock;

Use fake timers to handle the constructor's delayed warning:

beforeEach(() => {
  jest.useFakeTimers();
  // ... create handler
  jest.advanceTimersByTime(100);
});

afterEach(() => {
  jest.useRealTimers();
});

Integration tests use AppRunner to start a real CSS instance:

import { App, AppRunner, joinFilePath } from '@solid/community-server';

let app: App;

beforeAll(async () => {
  app = await new AppRunner().create({
    config: joinFilePath(__dirname, '../config/moderation-memory.json'),
    loaderProperties: {
      mainModulePath: joinFilePath(__dirname, '../../'),
      dumpErrorState: false,
    },
    shorthand: { port: 3456, loggingLevel: 'off' },
  });
  await app.start();
});

afterAll(async () => {
  await app.stop();
});

Full Configuration Example

The following is the full working configuration used in production. It uses file-based storage with pod locations and enables all security options:

{
  "@context": [
    "https://linkedsoftwaredependencies.org/bundles/npm/componentsjs/^5.0.0/components/context.jsonld",
    "https://linkedsoftwaredependencies.org/bundles/npm/@solid/community-server/^7.0.0/components/context.jsonld",
    {
      "scms": "https://linkedsoftwaredependencies.org/bundles/npm/@solid/content-moderation-sightengine/"
    }
  ],
  "import": [
    "css:config/app/main/default.json",
    "css:config/app/init/default.json",
    "css:config/app/variables/default.json",
    "css:config/http/handler/default.json",
    "css:config/http/middleware/default.json",
    "css:config/http/notifications/all.json",
    "css:config/http/server-factory/http.json",
    "css:config/http/static/default.json",
    "css:config/identity/access/public.json",
    "css:config/identity/email/default.json",
    "css:config/identity/handler/default.json",
    "css:config/identity/oidc/default.json",
    "css:config/identity/ownership/token.json",
    "css:config/identity/pod/static.json",
    "css:config/ldp/authentication/dpop-bearer.json",
    "css:config/ldp/authorization/webacl.json",
    "css:config/ldp/handler/components/authorizer.json",
    "css:config/ldp/handler/components/error-handler.json",
    "css:config/ldp/handler/components/operation-handler.json",
    "css:config/ldp/handler/components/operation-metadata.json",
    "css:config/ldp/handler/components/preferences.json",
    "css:config/ldp/handler/components/request-parser.json",
    "css:config/ldp/handler/components/response-writer.json",
    "css:config/ldp/metadata-parser/default.json",
    "css:config/ldp/metadata-writer/default.json",
    "css:config/ldp/modes/default.json",
    "css:config/storage/backend/file.json",
    "css:config/storage/key-value/resource-store.json",
    "css:config/storage/location/pod.json",
    "css:config/storage/middleware/default.json",
    "css:config/util/auxiliary/acl.json",
    "css:config/util/identifiers/suffix.json",
    "css:config/util/index/default.json",
    "css:config/util/logging/winston.json",
    "css:config/util/representation-conversion/default.json",
    "css:config/util/resource-locker/file.json",
    "css:config/util/variables/default.json"
  ],
  "@graph": [
    {
      "comment": "The main entry point into the main Solid behaviour - wired with moderation",
      "@id": "urn:solid-server:default:LdpHandler",
      "@type": "ParsingHttpHandler",
      "args_requestParser": { "@id": "urn:solid-server:default:RequestParser" },
      "args_errorHandler": { "@id": "urn:solid-server:default:ErrorHandler" },
      "args_responseWriter": { "@id": "urn:solid-server:default:ResponseWriter" },
      "args_operationHandler": {
        "@type": "AuthorizingHttpHandler",
        "args_credentialsExtractor": { "@id": "urn:solid-server:default:CredentialsExtractor" },
        "args_modesExtractor": { "@id": "urn:solid-server:default:ModesExtractor" },
        "args_permissionReader": { "@id": "urn:solid-server:default:PermissionReader" },
        "args_authorizer": { "@id": "urn:solid-server:default:Authorizer" },
        "args_operationHandler": {
          "@type": "WacAllowHttpHandler",
          "args_credentialsExtractor": { "@id": "urn:solid-server:default:CredentialsExtractor" },
          "args_modesExtractor": { "@id": "urn:solid-server:default:ModesExtractor" },
          "args_permissionReader": { "@id": "urn:solid-server:default:PermissionReader" },
          "args_operationHandler": { "@id": "urn:solid-server:default:ModerationOperationHandler" }
        }
      }
    },
    {
      "comment": "Moderation handler wrapping the default operation handler with custom thresholds",
      "@id": "urn:solid-server:default:ModerationOperationHandler",
      "@type": "scms:ModerationOperationHandler",
      "scms:ModerationOperationHandler#source": {
        "@id": "urn:solid-server:default:OperationHandler"
      },

      "scms:ModerationOperationHandler#nudityThreshold": 0.7,
      "scms:ModerationOperationHandler#violenceThreshold": 0.6,
      "scms:ModerationOperationHandler#weaponThreshold": 0.5,
      "scms:ModerationOperationHandler#alcoholThreshold": 0.8,
      "scms:ModerationOperationHandler#drugsThreshold": 0.5,
      "scms:ModerationOperationHandler#offensiveThreshold": 0.5,
      "scms:ModerationOperationHandler#selfharmThreshold": 0.3,
      "scms:ModerationOperationHandler#gamblingThreshold": 0.5,
      "scms:ModerationOperationHandler#tobaccoThreshold": 0.5,

      "scms:ModerationOperationHandler#textSexualThreshold": 0.5,
      "scms:ModerationOperationHandler#textDiscriminatoryThreshold": 0.5,
      "scms:ModerationOperationHandler#textInsultingThreshold": 0.5,
      "scms:ModerationOperationHandler#textViolentThreshold": 0.5,
      "scms:ModerationOperationHandler#textToxicThreshold": 0.5,
      "scms:ModerationOperationHandler#textSelfharmThreshold": 0.3,

      "scms:ModerationOperationHandler#enabledChecks": "nudity,gore,wad,offensive",
      "scms:ModerationOperationHandler#enabledTextChecks": "sexual,discriminatory,insulting,violent,toxic,self-harm,personal",
      "scms:ModerationOperationHandler#enabledVideoChecks": "nudity,gore,wad,offensive,self-harm,gambling,tobacco",

      "scms:ModerationOperationHandler#rejectUnknownTypes": true,
      "scms:ModerationOperationHandler#validateExtensions": true,
      "scms:ModerationOperationHandler#moderateUnknownTypes": true,
      "scms:ModerationOperationHandler#moderateRdfAsText": true,

      "scms:ModerationOperationHandler#auditLogPath": "./moderation-audit.log"
    }
  ]
}

Configuration Breakdown

This configuration:

  1. Uses full prefixed parameter names like scms:ModerationOperationHandler#nudityThreshold for explicit configuration
  2. Imports individual CSS component files for precise control over server behavior
  3. Uses file-based storage (file.json, pod.json, file.json for locker) for persistent data
  4. Enables all security options to protect against MIME type bypass attacks
  5. Configures audit logging to track all moderation decisions
  6. Rewires the LdpHandler to route requests through the moderation handler

Threshold Values Explained

| Category | Parameter | Value | Meaning |
|----------|-----------|-------|---------|
| Image | nudityThreshold | 0.7 | More permissive - only blocks explicit nudity |
| Image | violenceThreshold | 0.6 | Moderate strictness for gore/violence |
| Image | weaponThreshold | 0.5 | Default strictness for weapons |
| Image | alcoholThreshold | 0.8 | Permissive - allows most alcohol content |
| Image | drugsThreshold | 0.5 | Default strictness |
| Image | selfharmThreshold | 0.3 | Strict - blocks most self-harm content |
| Text | textSexualThreshold | 0.5 | Default strictness |
| Text | textDiscriminatoryThreshold | 0.5 | Default strictness |
| Text | textInsultingThreshold | 0.5 | Default strictness |
| Text | textViolentThreshold | 0.5 | Default strictness |
| Text | textToxicThreshold | 0.5 | Default strictness |
| Text | textSelfharmThreshold | 0.3 | Stricter for self-harm text |

Running with Full Configuration

cd /path/to/CommunitySolidServer
SIGHTENGINE_API_USER="your_user_id" \
SIGHTENGINE_API_SECRET="your_secret" \
node bin/server.js -c config/moderation.json -f ./data/ -p 3009

You should see output like:

[ModerationOperationHandler] SightEngine API configured
[ModerationOperationHandler] Configured thresholds: nudity=0.7, violence=0.6, weapon=0.5, alcohol=0.8, drugs=0.5, offensive=0.5, selfharm=0.3, gambling=0.5, tobacco=0.5
[ModerationOperationHandler] Enabled image checks: nudity, gore, wad, offensive
[ModerationOperationHandler] Enabled text checks: sexual, discriminatory, insulting, violent, toxic, self-harm, personal
[ModerationOperationHandler] Enabled video checks: nudity, gore, wad, offensive, self-harm, gambling, tobacco

Automated Testing

This project uses Jest for testing, following the same approach as the hello-world-component.

Test Structure

test/
├── config/
│   └── moderation-memory.json    # In-memory server config for testing
├── unit/
│   └── ModerationOperationHandler.test.ts   # Unit tests with mocked API
├── integration/
│   └── Server.test.ts            # Integration tests with running server
└── fixtures/
    ├── safe-image.jpg            # Test image that should pass moderation
    ├── safe-text.txt             # Test text that should pass moderation
    └── mock-responses.ts         # Mock SightEngine API responses

Running Tests

# Run all tests
npm test

# Run only unit tests
npm run test:unit

# Run only integration tests  
npm run test:integration

# Run tests with coverage
npm run test:coverage

Unit Tests

Unit tests verify the ModerationOperationHandler logic in isolation by mocking:

  • The SightEngine API responses
  • The wrapped OperationHandler
  • The CSS logger

Key test scenarios:

| Test Case | Description |
|-----------|-------------|
| Allow safe image | Image with scores below thresholds passes |
| Reject unsafe image | Image with nudity score above threshold is rejected |
| Allow safe text | Text without violations passes |
| Reject harmful text | Text with toxic content is rejected |
| Threshold configuration | Custom thresholds are applied correctly |
| API error handling | Fail-open policy allows content on API errors |
| Content type detection | Correct moderation type based on MIME type |
| Audit logging | Log entries are written correctly |
| Pass-through for GET | Non-mutating requests bypass moderation |

Example unit test pattern:

import { getLoggerFor, Logger, OperationHandler } from '@solid/community-server';
import { ModerationOperationHandler } from '../../src/ModerationOperationHandler';

// Mock the SightEngine API and CSS logger
jest.mock('@solid/community-server', () => ({
  ...jest.requireActual('@solid/community-server'),
  getLoggerFor: jest.fn(),
}));

// Mock fetch for SightEngine API calls
global.fetch = jest.fn();

describe('ModerationOperationHandler', (): void => {
  let handler: ModerationOperationHandler;
  let mockSource: jest.Mocked<OperationHandler>;
  let logger: jest.Mocked<Logger>;

  beforeEach((): void => {
    logger = { info: jest.fn(), warn: jest.fn(), error: jest.fn() } as any;
    (getLoggerFor as jest.Mock).mockReturnValue(logger);
    
    mockSource = { canHandle: jest.fn(), handle: jest.fn() } as any;
    handler = new ModerationOperationHandler(mockSource);
  });

  it('allows safe images below threshold.', async(): Promise<void> => {
    // Mock SightEngine response with safe scores
    (global.fetch as jest.Mock).mockResolvedValue({
      ok: true,
      json: () => Promise.resolve({
        status: 'success',
        nudity: { raw: 0.1 },
        weapon: 0.05,
        alcohol: 0.02,
      }),
    });

    // Create mock operation with image content
    const operation = createMockPutOperation('image/jpeg', safeImageBuffer);
    
    await expect(handler.handle(operation)).resolves.toBeDefined();
    expect(mockSource.handle).toHaveBeenCalled();
  });

  it('rejects images above nudity threshold.', async(): Promise<void> => {
    (global.fetch as jest.Mock).mockResolvedValue({
      ok: true,
      json: () => Promise.resolve({
        status: 'success',
        nudity: { raw: 0.85 },  // Above default 0.5 threshold
      }),
    });

    const operation = createMockPutOperation('image/jpeg', unsafeImageBuffer);
    
    await expect(handler.handle(operation)).rejects.toThrow('Content rejected');
    expect(mockSource.handle).not.toHaveBeenCalled();
  });
});

Integration Tests

Integration tests verify the complete server behavior with the moderation plugin:

import { App, AppRunner, joinFilePath } from '@solid/community-server';

describe('Moderation Server', (): void => {
  let app: App;

  beforeAll(async(): Promise<void> => {
    // Create server with in-memory config for testing
    app = await new AppRunner().create({
      config: joinFilePath(__dirname, '../config/moderation-memory.json'),
      loaderProperties: {
        mainModulePath: joinFilePath(__dirname, '../../'),
        dumpErrorState: false,
      },
      shorthand: {
        port: 3456,
        loggingLevel: 'off',
      },
      variableBindings: {}
    });
    await app.start();
  });

  afterAll(async(): Promise<void> => {
    await app.stop();
  });

  it('starts successfully.', async(): Promise<void> => {
    const response = await fetch('http://localhost:3456');
    expect(response.status).toBe(200);
  });

  it('accepts safe text uploads.', async(): Promise<void> => {
    const response = await fetch('http://localhost:3456/test.txt', {
      method: 'PUT',
      headers: { 'Content-Type': 'text/plain' },
      body: 'Hello, this is a safe message.',
    });
    expect(response.status).toBe(201);
  });

  it('rejects content that violates policy.', async(): Promise<void> => {
    // This test would require mocking the SightEngine API
    // or using test content known to trigger moderation
  });
});

Test Configuration

test/config/moderation-memory.json - In-memory server config for testing:

{
  "@context": [
    "https://linkedsoftwaredependencies.org/bundles/npm/componentsjs/^5.0.0/components/context.jsonld",
    "https://linkedsoftwaredependencies.org/bundles/npm/@solid/community-server/^7.0.0/components/context.jsonld",
    {
      "scms": "https://linkedsoftwaredependencies.org/bundles/npm/@solid/content-moderation-sightengine/"
    }
  ],
  "import": [
    "css:config/app/init/initialize-root.json",
    "css:config/app/main/default.json",
    "css:config/app/variables/default.json",
    "css:config/http/handler/default.json",
    "css:config/http/middleware/default.json",
    "css:config/http/server-factory/http.json",
    "css:config/http/static/default.json",
    "css:config/identity/access/public.json",
    "css:config/identity/handler/default.json",
    "css:config/ldp/authentication/dpop-bearer.json",
    "css:config/ldp/authorization/allow-all.json",
    "css:config/ldp/handler/default.json",
    "css:config/ldp/metadata-parser/default.json",
    "css:config/ldp/metadata-writer/default.json",
    "css:config/ldp/modes/default.json",
    "css:config/storage/backend/memory.json",
    "css:config/storage/key-value/memory.json",
    "css:config/storage/middleware/default.json",
    "css:config/util/auxiliary/acl.json",
    "css:config/util/identifiers/suffix.json",
    "css:config/util/index/default.json",
    "css:config/util/logging/winston.json",
    "css:config/util/representation-conversion/default.json",
    "css:config/util/resource-locker/memory.json",
    "css:config/util/variables/default.json"
  ],
  "@graph": [
    {
      "comment": "Test moderation handler with low thresholds for testing",
      "@id": "urn:solid-server:test:ModerationHandler",
      "@type": "scms:ModerationOperationHandler",
      "scms:ModerationOperationHandler#source": {
        "@id": "urn:solid-server:default:OperationHandler"
      }
    }
  ]
}

Jest Configuration

jest.config.js:

module.exports = {
  transform: {
    '^.+\\.ts$': ['ts-jest', {
      tsconfig: 'tsconfig.json',
    }],
  },
  testRegex: '/test/(unit|integration)/.*\\.test\\.ts$',
  moduleFileExtensions: ['ts', 'js'],
  testEnvironment: 'node',
  testTimeout: 60000,  // Allow time for server startup
  collectCoverageFrom: [
    'src/**/*.ts',
    '!src/index.ts',
  ],
  coverageThreshold: {
    global: {
      branches: 80,
      functions: 80,
      lines: 80,
      statements: 80,
    },
  },
};

Mocking SightEngine API

For unit tests, mock the SightEngine API rather than calling the real service:

// test/fixtures/mock-responses.ts

export const mockSafeImageResponse = {
  status: 'success',
  request: { id: 'test-req-123' },
  nudity: { raw: 0.01, partial: 0.02 },
  weapon: 0.01,
  alcohol: 0.01,
  drugs: 0.01,
  gore: { prob: 0.01 },
  offensive: { prob: 0.01 },
};

export const mockUnsafeImageResponse = {
  status: 'success',
  request: { id: 'test-req-456' },
  nudity: { raw: 0.95, partial: 0.85 },
  weapon: 0.02,
  alcohol: 0.01,
};

export const mockSafeTextResponse = {
  status: 'success',
  request: { id: 'test-req-789' },
  moderation_classes: {
    sexual: 0.01,
    discriminatory: 0.01,
    insulting: 0.02,
    violent: 0.01,
    toxic: 0.01,
  },
  personal: { matches: [] },
};

export const mockToxicTextResponse = {
  status: 'success',
  request: { id: 'test-req-abc' },
  moderation_classes: {
    sexual: 0.01,
    discriminatory: 0.01,
    insulting: 0.75,
    violent: 0.02,
    toxic: 0.82,
  },
};

export const mockApiError = {
  status: 'failure',
  error: { type: 'api_error', message: 'Rate limit exceeded' },
};
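Outside of jest, these fixtures can be wired in with a small helper that swaps out the global `fetch`. This is a sketch; the helper name `installFetchMock` is made up here and is not part of the plugin:

```typescript
// Hypothetical helper: install a fake fetch that always resolves with the
// given SightEngine-style payload.
function installFetchMock(payload: unknown): void {
  (globalThis as any).fetch = async (): Promise<{ ok: boolean; json: () => Promise<unknown> }> => ({
    ok: true,
    json: async (): Promise<unknown> => payload,
  });
}
```

Inside a jest suite, `(global.fetch as jest.Mock).mockResolvedValue(...)` as shown in the unit tests is usually the better choice, since it records calls and resets cleanly between tests.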

Package.json Scripts

Add these scripts to package.json:

{
  "scripts": {
    "test": "jest",
    "test:unit": "jest --testPathPattern=test/unit",
    "test:integration": "jest --testPathPattern=test/integration",
    "test:coverage": "jest --coverage",
    "test:watch": "jest --watch"
  }
}

Test Coverage

The test suite provides comprehensive coverage of the moderation handler:

| Metric | Coverage |
|--------|----------|
| Statements | 95.53% (599/627) |
| Branches | 90.40% (565/625) |
| Functions | 100% (34/34) |
| Lines | 95.62% (568/594) |
| Total Tests | 186 |

Coverage by Feature:

| Feature | Tests | Description |
|---------|-------|-------------|
| Image Moderation | 15+ | All content types (JPEG, PNG, GIF, WebP, BMP) and violation categories |
| Text Moderation | 15+ | Toxic, sexual, discriminatory, insulting, violent, self-harm, personal info |
| Video Moderation | 20+ | Frame-based and summary-based processing, all video formats |
| RDF/Linked Data | 14 | Turtle, JSON-LD, RDF/XML, SPARQL, N-Triples text extraction and moderation |
| Request Methods | 10+ | PUT, POST, PATCH handling, GET/HEAD pass-through |
| API Error Handling | 10+ | Fail-open policy, network errors, API failures |
| Audit Logging | 5+ | Log entry creation, directory creation |
| Edge Cases | 15+ | Missing data, partial responses, malformed inputs |
| Configuration | 5+ | Custom thresholds, disabled checks |
| Magic Byte Detection | 13 | Verifies actual file type matches claimed Content-Type |
| Reject Unknown Types | 12 | Blocks unrecognized MIME types |
| Extension Validation | 15 | Cross-checks file extension against Content-Type |
| Moderate Unknown Types | 11 | Detects actual type via magic bytes and moderates |

Running Coverage Report:

# Generate text coverage report
npm run test:coverage

# Generate HTML coverage report
npm run test:coverage -- --coverageReporters=html

# View detailed coverage
open coverage/lcov-report/index.html

Key Lessons Learned

  1. Declare `@solid/community-server` as a regular dependency, not a peerDependency
  2. Use full IRIs in `components.jsonld` when referencing parameters
  3. Use `fields` with `keyRaw`/`value` to pass plain-object arguments to constructors
  4. When overriding components, import the individual component config files rather than the aggregate configs
  5. Configuration priority: Components.js config > environment variables > defaults
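As an illustration of the `fields` pattern from lesson 3, the following Components.js sketch passes an options object to the handler's constructor. The `#options` parameter IRI and the key names are hypothetical; values are passed as strings here:

```json
{
  "@id": "urn:solid-server:default:ModerationHandler",
  "@type": "scms:ModerationOperationHandler",
  "scms:ModerationOperationHandler#options": {
    "fields": [
      { "keyRaw": "nudityThreshold", "value": "0.5" },
      { "keyRaw": "failOpen", "value": "true" }
    ]
  }
}
```

Each `keyRaw`/`value` pair becomes one property of the object the constructor receives, which is what makes this pattern suitable for free-form option bags that have no dedicated parameter IRIs.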