open-secure-viewer
v0.1.0
Zero-cost secure document viewer — PDFium WASM + Express security gateway. Drop-in replacement for Apryse/PDFTron WebViewer with no license fees.
Open Secure Viewer (OSV)
An in-house, zero-license-fee secure PDF viewer built on:
- PDFium WASM (@hyzyla/pdfium) — open-source PDF rendering engine
- Forked WebViewer UI — the open-source React UI layer (no Apryse license required)
- Express.js gateway — AES-256-GCM encrypted delivery, HMAC signed URLs, audit logging, session management
- Express.js converter — PDF linearization and format conversion
Architecture Overview
Browser
└── packages/viewer (port 3000) — React UI + PDFium WASM + security overlay
└── proxies /api/* and /docs/* to ↓
packages/gateway (port 4000) — Auth, doc tokens, streaming, annotations
packages/converter (port 3200) — PDF linearization, format conversion
Prerequisites
| Tool | Minimum version | Install |
|------|----------------|---------|
| Node.js | 20.x LTS | https://nodejs.org |
| npm | 10.x (bundled with Node 20) | — |
| Git | any recent | https://git-scm.com |
| Chrome or Edge | any recent | for Puppeteer smoke tests |
Optional for production only: PostgreSQL 15+, Redis 7+, AWS S3 or MinIO. In local dev everything runs with SQLite + in-memory Redis + local filesystem — no Docker needed.
Zero-Question First-Run Setup
Anyone cloning the repo for the first time can run a single script that handles everything automatically — no prompts, no manual .env editing:
# Windows
setup.bat
# macOS / Linux
chmod +x setup.sh && ./setup.sh
# Or directly (any platform with Node 20+)
node setup.js
The script:
- Checks Node.js ≥ 20 and npm ≥ 10
- Installs root workspace deps (--legacy-peer-deps)
- Installs upstream WebViewer UI deps
- Generates packages/gateway/.env with cryptographically random secrets
- Creates required data directories
- Detects optional tools (qpdf, LibreOffice/unoconv) and reports what was found
After the script finishes, run npm run dev to start all services.
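The secret-generation step listed above could look roughly like the following. This is a minimal sketch, not the actual contents of setup.js: the helper names randomSecret and buildEnvFile are hypothetical, and the only details taken from this README are the two variable names DOC_SIGNING_SECRET and DOC_JWT_SECRET.

```javascript
const crypto = require('node:crypto');

// Return a hex string carrying `bytes` bytes of entropy (32 bytes -> 64 hex chars).
function randomSecret(bytes = 32) {
  return crypto.randomBytes(bytes).toString('hex');
}

// Build the text of a .env file with freshly generated secrets.
// Assumption: the real setup script may write more variables than these two.
function buildEnvFile() {
  return [
    `DOC_SIGNING_SECRET=${randomSecret()}`,
    `DOC_JWT_SECRET=${randomSecret()}`,
    '',
  ].join('\n');
}

// Print the variable names only, never the generated values.
console.log(buildEnvFile().replace(/=.+/g, '=<redacted>'));
```

A 32-byte value comfortably satisfies the "min-32-chars" note that appears next to both secrets in the gateway .env listing.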
Quick Start (local dev — all three services)
1. Clone and install
git clone <repo-url> open-secure-viewer
cd open-secure-viewer
# Option A — automated (recommended for first-timers)
node setup.js # generates .env, installs deps, creates directories
# Option B — manual
npm install --legacy-peer-deps
npm run install:upstream
--legacy-peer-deps is required because the forked WebViewer UI has peer-dep conflicts that are intentionally overridden.
2. Start all services with one command
npm run dev
This uses concurrently to start all three services at once:
| Service | Port | Log prefix |
|---------|------|-----------|
| Viewer dev-server | 3000 | [viewer] |
| Gateway API | 4000 | [gateway] |
| Converter | 3200 | [converter] |
First run: The viewer takes 60–120 seconds to compile the WebViewer bundle (~4 MB). The terminal shows osv (webpack 5) compiled with N warnings when ready.
3. Open the developer portal
http://localhost:3000/portal
The portal lists all uploaded documents and provides one-click view links with automatic dev authentication.
4. View a specific document
http://localhost:3000/?doc=<document-uuid>
In dev mode, if no ?token= is provided the viewer auto-authenticates as [email protected].
Starting Services Individually
If you want to start services in separate terminals:
Terminal 1 — Gateway (API server)
npm run dev:gateway
# or
cd packages/gateway
npm run dev
Terminal 2 — Viewer (React UI + dev-server)
npm run dev:viewer
# or
cd packages/viewer
npm run dev
Terminal 3 — Converter (PDF processing)
npm run dev:converter
# or
cd packages/converter
npm run dev
Uploading a PDF
# Upload a PDF via the gateway API (dev auto-auth)
curl -s -X POST http://localhost:4000/api/dev/auth/login \
-H "Content-Type: application/json" \
-d '{"email":"[email protected]","role":"admin"}' | jq -r .jwt > /tmp/token.txt
curl -X POST http://localhost:4000/api/upload \
-H "Authorization: Bearer $(cat /tmp/token.txt)" \
-F "file=@/path/to/your.pdf" | jq .
Or use the portal at http://localhost:3000/portal — it shows an upload button.
Environment Variables
All variables have safe defaults for local dev. Only set them when you need to override.
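As a rough illustration of "defaults unless overridden", the sketch below resolves a handful of the gateway variables listed in this section. The function name loadGatewayConfig and the returned field names are hypothetical; the default values are copied from the listing and comments in this README, and the gateway's real config code may differ.

```javascript
// Resolve gateway settings from an env-like object, falling back to the
// dev defaults documented in this README. Illustrative only.
function loadGatewayConfig(env = process.env) {
  const deliveryMode = env.DOC_DELIVERY_MODE || 'plain';
  if (deliveryMode !== 'plain' && deliveryMode !== 'encrypted') {
    throw new Error(`Unsupported DOC_DELIVERY_MODE: ${deliveryMode}`);
  }
  return {
    port: Number(env.PORT || 4000),
    // APP_ORIGIN is a comma-separated allow-list of origins.
    allowedOrigins: (env.APP_ORIGIN || 'http://localhost:3000,http://localhost:5173')
      .split(',')
      .map((origin) => origin.trim())
      .filter(Boolean),
    deliveryMode,
    uploadMaxBytes: Number(env.UPLOAD_MAX_BYTES || 52428800), // 50 MB
    clamavRequired: env.CLAMAV_REQUIRED === 'true',
    watermarkBurn: (env.OSV_WATERMARK_BURN || 'on') === 'on',
    gatewayUrl: env.GATEWAY_URL || '', // empty => relative /docs/... URLs
  };
}

console.log(loadGatewayConfig({ APP_ORIGIN: 'https://app.example.com, https://admin.example.com' }));
```

Passing an explicit object instead of reading process.env directly keeps the function easy to exercise in tests.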
Gateway (packages/gateway/.env)
# Server
PORT=4000
NODE_ENV=development
# CORS — comma-separated allowed origins
APP_ORIGIN=http://localhost:3000,http://localhost:5173
# ── Database ─────────────────────────────────────────────────────────────────
# Dev default: SQLite at packages/gateway/data/dev.db (auto-created)
DATABASE_URL=sqlite://./data/dev.db
# Production: Postgres
# DATABASE_URL=postgres://user:password@localhost:5432/osv
# ── Redis ────────────────────────────────────────────────────────────────────
# Dev default: in-memory shim (no Redis needed); leave REDIS_URL unset
# Production
# REDIS_URL=redis://localhost:6379
# ── Storage ───────────────────────────────────────────────────────────────────
# Dev default: local filesystem at packages/gateway/data/storage/
STORAGE_TYPE=local
# STORAGE_LOCAL_ROOT=./data/storage
# Production S3/MinIO
# STORAGE_TYPE=s3
# S3_BUCKET=osv-documents
# S3_REGION=us-east-1
# S3_ENDPOINT=https://s3.amazonaws.com # or MinIO URL
# S3_ACCESS_KEY_ID=...
# S3_SECRET_ACCESS_KEY=...
# ── At-rest encryption for S3 ────────────────────────────────────────────────
# kms = SSE-KMS with a customer-managed CMK (recommended for prod —
# independent of bucket-access policy + CloudTrail-audited)
# s3 = SSE-S3 with S3-managed keys (free, no audit trail)
# unset = no client-side SSE request (rely on bucket default encryption)
# S3_SSE_MODE=kms
# S3_KMS_KEY_ID=arn:aws:kms:us-east-1:123456789012:key/abcd-1234
# ── Security secrets ──────────────────────────────────────────────────────────
# Dev defaults are set in code. CHANGE THESE IN PRODUCTION.
# DOC_SIGNING_SECRET=your-hmac-secret-min-32-chars
# DOC_JWT_SECRET=your-doc-jwt-secret-min-32-chars
# ── Document delivery mode ────────────────────────────────────────────────────
# plain = HMAC signed URL streaming (default, fastest)
# encrypted = AES-256-GCM end-to-end encryption
DOC_DELIVERY_MODE=plain
# ── Upload safety gates ───────────────────────────────────────────────────────
# Pre-flight content checks before files reach the converter.
# UPLOAD_MAX_BYTES = hard cap on multipart body size (default 50 MB)
# CLAMAV_BIN = scanner binary ('clamdscan' or 'clamscan')
# CLAMAV_TIMEOUT_MS = scan timeout in ms (default 30000)
# CLAMAV_REQUIRED = true = reject uploads if scanner missing (PRODUCTION).
# unset/false = allow uploads when scanner missing,
# but record MALWARE_SCAN_SKIPPED in the audit log.
# UPLOAD_MAX_BYTES=52428800
# CLAMAV_BIN=clamdscan
# CLAMAV_TIMEOUT_MS=30000
# CLAMAV_REQUIRED=true
# ── Watermark ─────────────────────────────────────────────────────────────────
# Server-side PDF watermark burn-in. on=enabled (default), off=disabled
OSV_WATERMARK_BURN=on
# ── URL base for signed stream URLs ──────────────────────────────────────────
# Leave empty in dev (returns relative /docs/... URLs, proxied by viewer)
# In production set to your public gateway URL: https://gateway.yourapp.com
GATEWAY_URL=
Continuous Integration
GitHub Actions runs on every push and PR to main (.github/workflows/ci.yml):
- typecheck — tsc --noEmit across every workspace
- regression — boots the gateway and runs scripts/regression-test.js (covers signed URLs, anti-replay, encrypted delivery, CSP headers, upload safety gates, audit log, annotations RBAC, doc-level expiry, …)
Heavier browser-side tests (csp-test.js and multi-tab-test.js) require
the full viewer build (~3-5 min) and are gated behind a manual
workflow_dispatch trigger — kick them off from the Actions tab before a
release. Locally, with all three dev servers up:
node scripts/csp-test.js # browser-enforced CSP
node scripts/multi-tab-test.js # SingleTabEnforcer eviction
Viewer (packages/viewer/.env — optional)
# URL the browser uses to reach the gateway API
# Leave empty in dev — the viewer dev-server proxies /api/* to localhost:4000
REACT_APP_GATEWAY_URL=
# Gateway URL used by the viewer dev-server for its portal page
OSV_GATEWAY_URL=http://localhost:4000
Converter (packages/converter/.env — optional)
PORT=3200
NODE_ENV=development
Dev Users (pre-seeded)
The gateway seeds these test users on first start:
| Email | Role | Password |
|-------|------|----------|
| [email protected] | admin | (dev only — use /api/dev/auth/login) |
| [email protected] | viewer | (dev only) |
| [email protected] | annotator | (dev only) |
| [email protected] | editor | (dev only) |
Get a JWT for any dev user:
curl -X POST http://localhost:4000/api/dev/auth/login \
-H "Content-Type: application/json" \
-d '{"email":"[email protected]","role":"viewer"}'
Dev auth endpoints (/api/dev/*) are only available when NODE_ENV !== 'production'.
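The same login call can be made from a Node 20 script. The sketch below mirrors the curl command and reads the jwt field that the upload example extracts with jq. The helper name devLogin, the injectable fetchImpl option, and the placeholder address dev.user@example.com are assumptions made for illustration.

```javascript
// POST to the dev-only login endpoint and return the JWT from the response.
// `fetchImpl` is injectable so the helper can be tested without a running gateway.
async function devLogin(email, role, { baseUrl = 'http://localhost:4000', fetchImpl = fetch } = {}) {
  const res = await fetchImpl(`${baseUrl}/api/dev/auth/login`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ email, role }),
  });
  if (!res.ok) {
    // Expected when the gateway is not running in dev mode.
    throw new Error(`Dev login failed: HTTP ${res.status}`);
  }
  const body = await res.json();
  return body.jwt;
}

// Usage (requires the gateway running locally in dev mode):
// devLogin('dev.user@example.com', 'viewer').then((jwt) => console.log(jwt.length));
```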
Testing
Smoke test (headless Chrome)
# Basic — checks the viewer loads without errors
npm run check:viewer
# With auto-auth — loads a real PDF document
npm run check:viewer:auth
# With a specific document
AUTO_AUTH=1 VIEWER_DOC_ID=<uuid> node scripts/check-viewer.js http://localhost:3000/ 15000
Comprehensive UI test (all toolbar features)
node scripts/ui-test.js
Tests zoom in/out, search, display modes, annotation tools, undo/redo, context menu suppression, and security controls. Screenshots saved to screenshots/.
Gateway regression test
npm run regression
Tests all gateway API endpoints: auth, token, stream, upload, annotations, session, audit.
Full suite
npm run regression:full
Building the PDFium Bridge
After modifying packages/viewer/overlay/pdfium-bridge/windowCore.js you must rebuild the bridge bundle:
# Development build (fast, with source maps)
cd packages/viewer
npm run build:pdfium:dev
# Production build (minified)
npm run build:pdfium
The output packages/viewer/public/pdfium-core.js is served statically by the dev-server and included in the production bundle.
Production Build
# Build all packages
npm run build
# The gateway TypeScript compiles to packages/gateway/dist/
# The viewer bundles to packages/viewer/upstream/build/
# The converter TypeScript compiles to packages/converter/dist/
Start production servers:
# Gateway
cd packages/gateway
node dist/index.js
# Converter
cd packages/converter
node dist/index.js
# Viewer (serve the static build behind nginx / CDN — no dev-server in prod)
Project Structure
open-secure-viewer/
├── packages/
│   ├── gateway/  # Express.js API server
│   │   ├── src/
│   │   │   ├── index.ts  # Entry point, middleware, CSP headers
│   │   │   ├── routes/
│   │   │   │   ├── devAuth.ts  # Dev-only login endpoint
│   │   │   │   ├── docToken.ts  # Issue HMAC-signed stream URLs
│   │   │   │   ├── docStream.ts  # Authenticated PDF byte streaming
│   │   │   │   ├── docMeta.ts  # Document metadata + list API
│   │   │   │   ├── upload.ts  # PDF upload (multipart)
│   │   │   │   ├── annotations.ts  # XFDF annotation persistence
│   │   │   │   ├── audit.ts  # Audit log endpoint
│   │   │   │   └── session.ts  # Session check / key derivation
│   │   │   └── lib/
│   │   │       ├── db.ts  # SQLite (dev) / Postgres (prod)
│   │   │       ├── redis.ts  # In-memory shim (dev) / Redis (prod)
│   │   │       ├── storage.ts  # Local filesystem (dev) / S3 (prod)
│   │   │       ├── encryption.ts  # AES-256-GCM document encryption
│   │   │       ├── watermark.ts  # Server-side PDF watermark burn-in
│   │   │       └── seed.ts  # Dev data seeding
│   │   └── data/  # SQLite DB + local storage (git-ignored)
│   │
│   ├── viewer/  # React UI + dev-server
│   │   ├── dev-server.js  # Express dev server + portal + webpack middleware
│   │   ├── public/  # Static assets (pdfium-core.js, pdfium.wasm)
│   │   ├── overlay/  # OSV security layer (loaded before WebViewer)
│   │   │   ├── pdfium-bridge/
│   │   │   │   └── windowCore.js  # window.Core implementation (PDFium WASM)
│   │   │   ├── security/
│   │   │   │   ├── boot.js  # Boot: loading stages, expiry countdown, session polling
│   │   │   │   └── index.js  # Keyboard/clipboard block, watermark, annotator toolbar
│   │   │   ├── api/
│   │   │   │   ├── client.js  # Base HTTP client (retry, rate-limit, toast integration)
│   │   │   │   ├── gateway.js  # Doc token + session API calls
│   │   │   │   ├── annotations.js  # XFDF annotation CRUD
│   │   │   │   ├── documents.js  # Document list, metadata, permission management
│   │   │   │   ├── admin.js  # Admin ops (terminate sessions, reset slots)
│   │   │   │   └── audit.js  # Fire-and-forget audit event emitter
│   │   │   └── components/
│   │   │       ├── Toast.js  # Imperative toast notification system
│   │   │       ├── SessionExpired.js  # Session-expired screen
│   │   │       ├── MetadataPanel.js  # Collapsible document metadata panel
│   │   │       └── AnnotationSync.js  # Bidirectional XFDF annotation sync
│   │   └── upstream/  # Forked WebViewer UI (submodule-style)
│   │
│   └── converter/  # PDF processing service
│       └── src/
│           └── routes/
│               └── convert.ts  # PDF linearization (qpdf / pdf-lib fallback)
│
├── setup.js  # Zero-question first-run setup (cross-platform)
├── setup.bat  # Windows launcher for setup.js
├── setup.sh  # macOS/Linux launcher for setup.js
├── scripts/
│   ├── check-viewer.js  # Headless smoke test (Puppeteer)
│   ├── ui-test.js  # Comprehensive UI interaction test
│   ├── regression-test.js  # Gateway API regression tests
│   ├── fix-permissions.js  # Dev utility: reset doc permissions
│   └── update-upstream.js  # Merge upstream WebViewer UI changes
│
└── screenshots/  # Test screenshots (git-ignored)
API Reference (Gateway)
| Method | Path | Auth | Description |
|--------|------|------|-------------|
| POST | /api/dev/auth/login | — | Dev-only: get JWT for test user |
| POST | /api/dev/auth/reset-slots | — | Dev-only: clear all anti-replay slots |
| POST | /api/dev/auth/grant-all | Dev JWT | Dev-only: grant view perm on all docs for all users |
| POST | /api/doc-token | JWT | Issue HMAC signed URL + doc token |
| GET | /api/docs | JWT | List accessible documents |
| GET | /api/docs/:id/metadata | JWT | Document metadata |
| POST | /api/docs/:id/grant | Admin JWT | Grant permission to a user |
| POST | /api/docs/:id/revoke | Admin JWT | Revoke permission from a user |
| GET | /docs/:id?uid=&exp=&sig= | Doc token | Stream PDF bytes (plain or AES-256-GCM) |
| POST | /api/upload | JWT (editor+) | Upload a PDF (malware scan → structure check → convert → store) |
| GET/PUT/DELETE | /api/annotations/:docId | JWT | XFDF annotation CRUD |
| POST | /api/audit | Doc token | Client-side audit event |
| GET | /api/audit/events | Admin JWT | Query audit log (admin only) |
| GET | /api/audit/recent | Dev JWT | Last N events (portal live stream) |
| GET | /api/session/check | JWT | Validate active session |
| GET | /api/session/key | Doc JWT | Derive AES-256-GCM session key (enc delivery) |
| POST | /api/session/invalidate | Admin JWT | Terminate another user's session |
| POST | /api/security/clear | — | Clear-Site-Data + service worker kill-switch |
| POST | /api/csp-report | — | CSP violation report sink |
| GET | /health | — | Health check |
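For the encrypted delivery mode served by GET /docs/:id and keyed via GET /api/session/key, the sketch below shows a generic AES-256-GCM seal and open round trip using Node's built-in crypto module. The function names and the byte layout (12-byte IV, then 16-byte auth tag, then ciphertext) are assumptions for illustration; the gateway's actual wire format and key derivation are not described in this README.

```javascript
const crypto = require('node:crypto');

// Encrypt bytes with AES-256-GCM.
// Assumed layout: iv (12 bytes) || authTag (16 bytes) || ciphertext.
function sealDocument(plainBytes, key) {
  const iv = crypto.randomBytes(12); // fresh nonce for every message
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plainBytes), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

// Decrypt and authenticate; throws if the key is wrong or the data was modified.
function openDocument(sealed, key) {
  const iv = sealed.subarray(0, 12);
  const tag = sealed.subarray(12, 28);
  const ciphertext = sealed.subarray(28);
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

const sessionKey = crypto.randomBytes(32); // stand-in for a per-session key
const sealed = sealDocument(Buffer.from('%PDF-1.7 example bytes'), sessionKey);
console.log(openDocument(sealed, sessionKey).toString()); // %PDF-1.7 example bytes
```

Because GCM is authenticated, a tampered payload fails at decipher.final() rather than yielding corrupted PDF bytes.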
Security Features
| Feature | Description |
|---------|-------------|
| HMAC signed URLs | Every stream URL is time-limited (10 min) and user-specific |
| Anti-replay (atomic) | Redis SET … GET atomic slot prevents URL farming; keyed on uid:docId:exp |
| AES-256-GCM | Optional encrypted delivery mode — PDF bytes sealed end-to-end with a per-session key |
| Server watermark | Forensic user+session watermark burned into PDF bytes before delivery |
| Client watermark | Canvas overlay watermark with user email, session ID, date |
| Anti-hotlink | Origin/Referer checked against allow-list; requests with neither header are rejected (fail-closed) |
| Upload safety gates | MIME magic-byte validation, ClamAV malware scan, Portfolio/embedded-file/macro/JS detection |
| OCG stripping | Optional Content Groups (/OCProperties) stripped from PDFs at upload time |
| Clipboard blocked | navigator.clipboard nulled, copy keyboard shortcuts intercepted |
| Print blocked | window.print overridden, @media print CSS hides all content |
| Context menu disabled | Right-click suppressed on the viewer canvas |
| Text selection disabled | CSS user-select: none on the document body |
| Session polling | Periodic heartbeat — revoked sessions are kicked within 30 s |
| Multi-device cap | Redis-tracked concurrent session limit per user (configurable via OSV_MAX_SESSIONS) |
| JWT expiry countdown | Boot layer decodes JWT expiry and shows a live countdown from 5 min before expiry |
| HTTP security headers | CSP (per-route sandboxed for /docs/*), HSTS, X-Frame-Options, CORP, COOP, Referrer-Policy, Permissions-Policy |
| Audit logging | Every access, denial, upload, annotation change, and replay attempt is logged |
| Document expiry | Per-permission expires_at, max_views, and revoked_at |
| XFDF annotation RBAC | Annotator role can write; viewer is read-only; cross-user reads require admin |
| CSP report sink | POST /api/csp-report accepts both application/csp-report and application/reports+json |
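The first two rows of the table (HMAC signed URLs and the anti-replay slot) can be sketched together as follows. This is an illustration under stated assumptions, not the gateway's code: the signed message format, the exp value in Unix seconds, and the single-use interpretation of the slot are guesses, and a plain in-memory Set stands in for the atomic Redis operation. The URL shape /docs/:id?uid=&exp=&sig=, the 10-minute lifetime, and the uid:docId:exp slot key come from this README.

```javascript
const crypto = require('node:crypto');

// HMAC-SHA256 over an assumed "docId:uid:exp" message.
function sign(secret, docId, uid, exp) {
  return crypto.createHmac('sha256', secret).update(`${docId}:${uid}:${exp}`).digest('hex');
}

// Issue a user-specific stream URL that expires after ttlMs (10 minutes by default).
function createSignedUrl(secret, docId, uid, nowMs = Date.now(), ttlMs = 10 * 60 * 1000) {
  const exp = Math.floor((nowMs + ttlMs) / 1000);
  return `/docs/${docId}?uid=${encodeURIComponent(uid)}&exp=${exp}&sig=${sign(secret, docId, uid, exp)}`;
}

const usedSlots = new Set(); // in-memory stand-in for the Redis slot store

// Returns 'ok', 'bad-signature', 'expired', or 'replayed'.
function verifySignedUrl(secret, url, nowMs = Date.now()) {
  const parsed = new URL(url, 'http://localhost');
  const docId = parsed.pathname.split('/').pop();
  const uid = parsed.searchParams.get('uid');
  const exp = Number(parsed.searchParams.get('exp'));
  const given = Buffer.from(parsed.searchParams.get('sig') || '', 'hex');
  const expected = Buffer.from(sign(secret, docId, uid, exp), 'hex');
  // Constant-time comparison; lengths must match before timingSafeEqual is called.
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) {
    return 'bad-signature';
  }
  if (nowMs / 1000 > exp) return 'expired';
  const slot = `${uid}:${docId}:${exp}`;
  if (usedSlots.has(slot)) return 'replayed';
  usedSlots.add(slot);
  return 'ok';
}

const url = createSignedUrl('dev-secret', 'doc-123', 'user-1');
console.log(verifySignedUrl('dev-secret', url)); // ok
console.log(verifySignedUrl('dev-secret', url)); // replayed
```

With several gateway instances the check-and-claim step has to be a single atomic store operation, which is the role the table assigns to Redis.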
Troubleshooting
Viewer shows "Compiling…" for a long time
Normal on first start. The WebViewer UI bundle (~4 MB) takes 60–120 seconds to compile. Wait for osv (webpack 5) compiled in the terminal.
Gateway: EADDRINUSE: address already in use :::4000
Kill the old process:
# Windows PowerShell
$pids = (netstat -ano | findstr LISTENING | findstr :4000) | ForEach-Object { ($_ -split '\s+')[-1] }
$pids | ForEach-Object { Stop-Process -Id $_ -Force -ErrorAction SilentlyContinue }
Viewer: EADDRINUSE: address already in use 0.0.0.0:3000
Get-NetTCPConnection -LocalPort 3000 -ErrorAction SilentlyContinue | ForEach-Object { Stop-Process -Id $_.OwningProcess -Force -ErrorAction SilentlyContinue }
Viewer stuck on "Initializing secure viewer…" with 401 Session expired
The JWT in your URL was minted against a session that no longer exists in the gateway (typical after a gateway restart, or when reusing a pasted/bookmarked URL after the Redis shim was cleared). Two recovery paths are wired up:
- Automatic (dev only) — boot.js detects the 401 from /api/doc-token, decodes the JWT to extract email, re-authenticates via /api/dev/auth/login, rewrites the URL transparently, and retries once. You'll see [OSV] Stale token detected (401). Auto-refreshed via dev-auth for <email> in the console.
- Manual — open /portal and click any user button next to the document; this always mints a fresh token via /view/:docId?as=<email>.
In production (where /api/dev/* is not mounted) the boot shows a friendly error card with a "Go to portal" button instead — the user re-authenticates through SSO and reloads.
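The "decode the JWT to extract email" step in the automatic path needs no secret, because the payload is only base64url-encoded. Below is a minimal Node-side sketch; boot.js runs in the browser and would need a browser-compatible decoder instead of Buffer. The helper names, the demo token, and the placeholder address are hypothetical. The email claim is named in this README, and exp is the standard JWT expiry claim.

```javascript
// Decode the payload segment of a JWT WITHOUT verifying its signature.
// Only suitable for reading hints such as `email` or `exp`, never for authorization.
function decodeJwtPayload(token) {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('Not a JWT');
  return JSON.parse(Buffer.from(parts[1], 'base64url').toString('utf8'));
}

// Seconds left before the token expires, or null if it has no numeric `exp` claim.
function secondsUntilExpiry(token, nowMs = Date.now()) {
  const { exp } = decodeJwtPayload(token);
  return typeof exp === 'number' ? exp - Math.floor(nowMs / 1000) : null;
}

// Demo token with an empty header ('e30' is base64url for '{}') and a dummy signature.
const demoPayload = { email: 'dev.user@example.com', exp: 2000000000 };
const demoToken = ['e30', Buffer.from(JSON.stringify(demoPayload)).toString('base64url'), 'sig'].join('.');

console.log(decodeJwtPayload(demoToken).email); // dev.user@example.com
```

The same exp value is what an expiry countdown, like the one listed under Security Features, would read.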
PDF shows 403 "Document fetch failed"
This happens when the anti-replay Redis slot from a previous session is still held. Restarting the gateway clears the in-memory Redis shim. Alternatively, run:
node scripts/fix-permissions.js
PDF shows 410 "Access expired"
The viewer user's document permission has an expires_at in the past (set by e2e tests). Reset it:
node scripts/fix-permissions.js
Search crashes the viewer (infinite re-renders)
Ensure you have rebuilt the PDFium bridge after any windowCore.js changes:
cd packages/viewer
npm run build:pdfium:dev
"No Chrome/Edge executable found" (smoke test)
Install Google Chrome or Microsoft Edge, or set CHROME_BIN:
CHROME_BIN="/path/to/chrome" npm run check:viewer
Development Workflow
1. Start services → npm run dev
2. Open portal → http://localhost:3000/portal
3. Upload or select a PDF
4. Test features in browser
5. After editing windowCore.js → npm run build:pdfium:dev (in packages/viewer)
6. After editing gateway code → nodemon auto-restarts
7. Run UI test → node scripts/ui-test.js
8. Run regression → npm run regression