pixa-content
v0.1.1
Secure content processing for PIXA blockchain — Markdown/HTML rendering, sanitization, metadata parsing, summarization
Secure content processing engine for the PIXA blockchain platform. Rust → WebAssembly module for browser-side rendering of posts, comments, profiles, and metadata.
Features
| Feature | Description |
|---------|-------------|
| Post Rendering | Markdown/HTML → sanitized HTML with full formatting |
| Comment Rendering | Stricter subset — no headings, tables, iframes |
| @Mentions | @username → <a href="/@username"> with validation |
| #Hashtags | #tag → <a href="/trending/tag"> |
| Image Extraction | Returns image list; render with/without images |
| Base64 Images | Validates data:image/* URIs, blocks dangerous MIME types |
| External Links | Marks external links with data-* attrs for React dialog |
| Biography | HTML → plain text sanitization |
| Username | HIVE-compatible validation (3-16 chars, a-z0-9.-) |
| Metadata | Flexible JSON parsing for PIXA/HIVE/STEEM variants |
| Plain Text | Strip all formatting, return clean text |
| Summarization | TF-IDF extractive summarization |
| XSS Protection | Whitelist-based sanitization via ammonia |
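As a rough illustration of the Base64 Images row, the check below accepts a `data:` URI only when its MIME type is on an image allowlist. This is a minimal sketch using the standard library only: the function name `allowed_data_image_mime` and the `ALLOWED_IMAGE_MIME` list are hypothetical, and the crate's real allowlist and limits may differ.

```rust
/// Image MIME types accepted in `data:` URIs.
/// Assumption for illustration only; the crate's actual list is not documented here.
const ALLOWED_IMAGE_MIME: [&str; 4] = ["image/png", "image/jpeg", "image/gif", "image/webp"];

/// Returns the MIME type if `uri` is a base64 `data:` URI with an allowed image type.
fn allowed_data_image_mime(uri: &str) -> Option<&str> {
    // Anything that is not a data URI is out of scope for this check.
    let rest = uri.strip_prefix("data:")?;
    // Split the header ("image/png;base64") from the encoded payload.
    let (header, payload) = rest.split_once(',')?;
    // Only base64-encoded payloads are considered.
    let mime = header.strip_suffix(";base64")?;
    if payload.is_empty() || !ALLOWED_IMAGE_MIME.iter().any(|m| *m == mime) {
        return None;
    }
    Some(mime)
}
```

Note that this sketch simply rejects `image/svg+xml`, whereas the crate is described as decoding Base64 SVGs and checking them for embedded scripts. The comparison is also case-sensitive, which is the conservative choice for an allowlist.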
Security Model
- Whitelist-based HTML sanitization via `ammonia`: only approved tags and attributes pass through
- No script execution: `<script>`, event handlers (`onclick`, `onerror`, etc.), and `javascript:` URIs are all blocked
- SVG safety: Base64 SVGs are decoded and checked for embedded scripts
- Link isolation: external links get `rel="noopener noreferrer"` and `target="_blank"`
- Input limits: maximum body length, image size, nesting depth, and URL length are enforced
- Username validation: strict HIVE-compatible rules prevent injection via mention links
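To show why username validation matters for mention links, here is a minimal sketch that links an `@name` only when the name passes validation. It checks just the two rules stated in the Features table (3-16 characters from `a-z0-9.-`); full HIVE rules are stricter. The function names `is_valid_username` and `link_mentions` are hypothetical, and trimming trailing `.` or `-` is an assumption made so that sentence punctuation is not swallowed into the link.

```rust
/// Checks only the rules stated in this README: length 3-16, characters a-z, 0-9, '.', '-'.
/// Assumption: real HIVE validation has further rules that this sketch omits.
fn is_valid_username(name: &str) -> bool {
    (3..=16).contains(&name.len())
        && name
            .bytes()
            .all(|b| b.is_ascii_lowercase() || b.is_ascii_digit() || b == b'.' || b == b'-')
}

/// Replaces `@name` tokens with links, leaving anything that fails validation untouched.
/// Assumes `text` is a plain text node; it does not skip emails or existing links.
fn link_mentions(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(at) = rest.find('@') {
        out.push_str(&rest[..at]);
        let after = &rest[at + 1..];
        // Candidate name: the longest run of allowed characters after '@'.
        let end = after
            .find(|c: char| !(c.is_ascii_lowercase() || c.is_ascii_digit() || c == '.' || c == '-'))
            .unwrap_or(after.len());
        // Drop trailing '.' or '-' so "Hello @world." links "world", not "world.".
        let candidate = after[..end].trim_end_matches(|c: char| c == '.' || c == '-');
        if is_valid_username(candidate) {
            out.push_str(&format!("<a href=\"/@{0}\">@{0}</a>", candidate));
        } else {
            // Invalid names are emitted verbatim; no markup is generated from them.
            out.push('@');
            out.push_str(candidate);
        }
        rest = &after[candidate.len()..];
    }
    out.push_str(rest);
    out
}
```

Because only validated names ever reach the `href`, input such as `@<script>` never produces a link. The sketch still relies on the sanitization pass to handle any markup left in the surrounding text.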
Build
Prerequisites
# Rust toolchain
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup target add wasm32-unknown-unknown
# wasm-pack
cargo install wasm-pack
# Optional: binaryen for extra ~20-30% size reduction
npm install -g binaryen
Build Commands
# Standard release build (optimized for size)
./scripts/build.sh release
# Smaller build with wee_alloc (trades speed for ~20KB less)
./scripts/build.sh release-small
# Dev build (fast compile, no optimization)
./scripts/build.sh dev
# For bundlers (webpack, vite, rollup)
./scripts/build.sh bundler
# Run all tests
./scripts/build.sh test
Expected Output Size
| Build | Raw WASM | + wasm-opt | Gzip |
|-------|----------|------------|------|
| release | ~800KB-1.2MB | ~600KB-900KB | ~250KB-350KB |
| release-small | ~750KB-1.1MB | ~550KB-850KB | ~220KB-320KB |
Why not smaller? The bulk comes from `ammonia` (an HTML sanitizer using the `html5ever` parser) and `regex`. These are essential for security. The gzipped transfer size is what matters for users, and 250-350KB gzipped is reasonable for a full content engine replacing multiple JavaScript libraries.
Usage
JavaScript / React
import { PixaContent } from './pixa-content.js';
// Initialize once
const pixa = await PixaContent.init('./pixa_content.js');
// ── Render a post ──────────────────────────────
const result = pixa.renderPostBody(postBody, {
  include_images: true,
  max_image_count: 0, // 0 = unlimited
  internal_domains: ['pixa.pics', 'custom-domain.com'],
});
console.log(result.html); // Sanitized HTML
console.log(result.images); // [{ src, alt, is_base64, index }]
console.log(result.links); // [{ href, text, domain, is_external }]
// ── Render without images (for previews) ───────
const preview = pixa.renderPostBody(postBody, { include_images: false });
// preview.images still has the list, but HTML has no <img> tags
// ── Render a comment ───────────────────────────
const comment = pixa.renderCommentBody(commentBody);
// ── Extract plain text ─────────────────────────
const text = pixa.extractPlainText(postBody);
// ── Summarize ──────────────────────────────────
const summary = pixa.summarizeContent(postBody, 3);
console.log(summary.summary); // Top 3 sentences joined
console.log(summary.keywords); // Top keywords with scores
console.log(summary.sentences); // Scored sentences with positions
// ── Parse metadata ─────────────────────────────
const meta = pixa.parseMetadata(jsonMetadataString);
console.log(meta.profile); // { name, about, profile_image, ... }
console.log(meta.tags); // ['tag1', 'tag2']
console.log(meta.extra); // Any PIXA-specific extension fields
// ── Sanitize profile data ──────────────────────
const bio = pixa.sanitizeBiography(rawBio, 256); // max 256 chars
const name = pixa.sanitizeUsername(rawUsername); // '' if invalid
External Link Dialog (React)
import { ExternalLinkDialog } from './ExternalLinkDialog';
function PostContent({ html }) {
  return (
    <ExternalLinkDialog
      onNavigate={(href, domain) => {
        console.log('User navigated to:', domain);
      }}
    >
      <div dangerouslySetInnerHTML={{ __html: html }} />
    </ExternalLinkDialog>
  );
}
Or with a custom dialog:
<ExternalLinkDialog
  renderDialog={({ href, domain, onConfirm, onCancel }) => (
    <YourCustomModal
      title={`Leave Pixa?`}
      message={`Navigate to ${domain}?`}
      onYes={onConfirm}
      onNo={onCancel}
    />
  )}
>
  <div dangerouslySetInnerHTML={{ __html: html }} />
</ExternalLinkDialog>
Using from Rust (Non-WASM)
use pixa_content::{render_post_body, render_comment_body, extract_plain_text};
use pixa_content::types::RenderOptions;
fn main() {
    // Default options; see RenderOptions for the available fields.
    let opts = RenderOptions::default();
    let result = render_post_body("# Hello @world\n\nCheck #pixelart!", &opts);
    println!("HTML: {}", result.html);
    println!("Images: {:?}", result.images);
    println!("Links: {:?}", result.links);
}
Architecture
pixa-content/
├── Cargo.toml # Dependencies & WASM optimization config
├── src/
│ ├── lib.rs # Public API & WASM bindings
│ ├── types.rs # Shared types (RenderResult, ImageInfo, etc.)
│ ├── sanitizer.rs # ammonia-based HTML sanitization
│ ├── mentions.rs # @mention and #hashtag processing
│ ├── images.rs # Image extraction, base64 validation
│ ├── links.rs # External link detection & wrapping
│ ├── text.rs # Plain text extraction, sentence splitting
│ ├── summarizer.rs # TF-IDF extractive summarization
│ └── metadata.rs # JSON metadata parsing (PIXA/HIVE/STEEM)
├── tests/
│ └── integration.rs # Full pipeline integration tests
├── js/
│ ├── pixa-content.js # JS wrapper API
│ └── ExternalLinkDialog.jsx # React external link component
├── scripts/
│ └── build.sh # Build & optimization script
└── README.md
Processing Pipeline
Input (Markdown or HTML)
│
├──► Detect format (is_predominantly_html)
│
├──► If Markdown: pulldown-cmark → HTML
│
├──► Extract images (before sanitization)
│
├──► Process @mentions and #hashtags
│ (text nodes only, not inside existing links)
│
├──► Sanitize HTML (ammonia whitelist)
│ ├── Post: full formatting
│ └── Comment: restricted subset
│
├──► Process links (internal vs external)
│ └── External: add data-* attrs for React
│
├──► Optionally strip/limit images
│
└──► Return { html, images, links }
Metadata Flexibility
The metadata parser handles all known HIVE/STEEM/PIXA fields with proper typing, while preserving unknown fields in an extra map for forward compatibility:
{
  "profile": { "name": "...", "about": "...", ... },
  "tags": ["tag1", "tag2"],
  "pixa_nft_id": "preserved-in-extra",
  "custom_field": { "also": "preserved" }
}
URL fields in profiles handle the `__url__` underscore wrapping pattern found in some HIVE metadata.
License
Proprietary — Pixagram SA. All rights reserved.
