@yfedoseev/meta-oxide-wasm
v0.1.1
WebAssembly bindings for MetaOxide - Universal metadata extraction library
MetaOxide WASM
Universal metadata extraction library for JavaScript/TypeScript via WebAssembly
High-performance WASM bindings for the MetaOxide Rust library. Extract all major metadata formats from HTML with blazing-fast performance in any JavaScript runtime.
✨ Features
- 🚀 Blazing Fast - Native Rust performance via WebAssembly
- 🌍 Universal - Works in browsers, Node.js, Deno, and edge computing platforms
- 📦 Zero Dependencies - Self-contained WASM module
- 🔒 Type Safe - Complete TypeScript definitions included
- 🎯 11 Metadata Formats - Comprehensive extraction support
- ⚡ Edge Ready - Optimized for Cloudflare Workers, Vercel Edge, etc.
- 🧪 Well Tested - 30+ tests ensuring correctness
📦 Installation
npm install @yfedoseev/meta-oxide-wasm
Or with your preferred package manager:
yarn add @yfedoseev/meta-oxide-wasm
pnpm add @yfedoseev/meta-oxide-wasm
🚀 Quick Start (5 Minutes)
Browser
<!DOCTYPE html>
<html>
<head>
<title>MetaOxide Demo</title>
</head>
<body>
<script type="module">
import { extractAll } from 'https://unpkg.com/@yfedoseev/meta-oxide-wasm';
const html = document.documentElement.outerHTML;
const metadata = await extractAll(html, {
baseUrl: window.location.href
});
console.log('Open Graph:', metadata.openGraph);
console.log('Twitter Card:', metadata.twitter);
console.log('JSON-LD:', metadata.jsonLd);
</script>
</body>
</html>
Node.js
import { extractAll } from '@yfedoseev/meta-oxide-wasm';
const html = await fetch('https://example.com').then(r => r.text());
const metadata = await extractAll(html, { baseUrl: 'https://example.com' });
console.log(metadata.openGraph?.title);
console.log(metadata.meta?.description);
TypeScript
import { extractAll, type ExtractionResult } from '@yfedoseev/meta-oxide-wasm';
async function getMetadata(url: string): Promise<ExtractionResult> {
const response = await fetch(url);
const html = await response.text();
return extractAll(html, { baseUrl: url });
}
const metadata = await getMetadata('https://example.com');
console.log(metadata.openGraph?.image);
📚 API Reference
Initialization
import { initialize } from '@yfedoseev/meta-oxide-wasm';
// Initialize WASM module (automatic on first use)
await initialize();
The WASM module is initialized automatically on the first call to any extraction function. Manual initialization is optional, but recommended when you want to control when the loading cost is paid (for example, at application startup).
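One way to control that timing is to wrap the initializer so it runs at most once and every caller shares the same promise. The following is a minimal sketch, not part of the library: `initOnce` is a hypothetical helper, and it only assumes that `initialize` returns a promise, as the `await initialize()` call above suggests. The library may already guard against repeated initialization on its own.

```typescript
// Wraps an async initializer so that it runs at most once, even when
// several callers request it concurrently. (Hypothetical helper.)
function initOnce(init: () => Promise<unknown>): () => Promise<unknown> {
  let pending: Promise<unknown> | undefined;
  return () => {
    if (!pending) {
      pending = init().catch((err) => {
        pending = undefined; // allow a retry after a failed attempt
        throw err;
      });
    }
    return pending;
  };
}

// Assumed usage with this package (not executed here):
//   import { initialize } from '@yfedoseev/meta-oxide-wasm';
//   const ready = initOnce(initialize);
//   ready();        // start loading at application startup
//   await ready();  // later callers reuse the same promise
```

Clearing the stored promise on failure is a design choice: a transient load error does not leave the application permanently stuck on a rejected promise.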
Extract All Formats
function extractAll(
html: string,
options?: { baseUrl?: string }
): Promise<ExtractionResult>
Extracts all 11 metadata formats in a single pass. This is the most efficient choice for comprehensive extraction.
Example:
const metadata = await extractAll(html, { baseUrl: 'https://example.com' });
// Access different formats
metadata.meta // HTML meta tags
metadata.openGraph // Open Graph
metadata.twitter // Twitter Card
metadata.jsonLd // JSON-LD
metadata.microdata // Microdata
metadata.microformats // Microformats
metadata.rdfa // RDFa
metadata.dublinCore // Dublin Core
metadata.manifest // Web App Manifest
metadata.oembed // oEmbed
metadata.relLinks // rel-* links
Individual Extractors
Extract specific metadata formats:
// HTML meta tags
const meta = await extractMeta(html, { baseUrl });
// { description: "...", keywords: "...", ... }
// Open Graph
const og = await extractOpenGraph(html, { baseUrl });
// { title: "...", type: "website", image: "...", ... }
// Twitter Card
const twitter = await extractTwitter(html, { baseUrl });
// { card: "summary_large_image", site: "@example", ... }
// JSON-LD
const jsonLd = await extractJsonLd(html, { baseUrl });
// { items: [{ "@type": "Article", ... }] }
// Microdata
const microdata = await extractMicrodata(html, { baseUrl });
// { items: [{ type: ["https://schema.org/Person"], ... }] }
// Microformats
const microformats = await extractMicroformats(html, { baseUrl });
// { items: [{ type: ["h-card"], properties: {...} }] }
// RDFa
const rdfa = await extractRDFa(html, { baseUrl });
// { triples: [{ subject: "...", predicate: "...", object: "..." }] }
// Dublin Core
const dc = await extractDublinCore(html, { baseUrl });
// { title: "...", creator: "...", date: "..." }
// Web App Manifest
const manifest = await extractManifest(html, { baseUrl });
// { href: "/manifest.json", type: "application/manifest+json" }
// oEmbed
const oembed = await extractOEmbed(html, { baseUrl });
// { json: "https://...", xml: "https://..." }
// rel-* links
const relLinks = await extractRelLinks(html, { baseUrl });
// { canonical: [{ href: "..." }], alternate: [...] }
🌐 Platform Support
Browser Support
Works in all modern browsers with WebAssembly support:
| Browser | Version | Support |
|---------|---------|---------|
| Chrome | 57+ | ✅ |
| Firefox | 52+ | ✅ |
| Safari | 11+ | ✅ |
| Edge | 16+ | ✅ |
| Opera | 44+ | ✅ |
Runtime Support
| Runtime | Version | Support | Notes |
|---------|---------|---------|-------|
| Node.js | 18+ | ✅ | Full support |
| Deno | 1.0+ | ✅ | Native TypeScript |
| Bun | 1.0+ | ✅ | Fast runtime |
| Cloudflare Workers | - | ✅ | Edge computing |
| Vercel Edge | - | ✅ | Serverless edge |
| Netlify Edge | - | ✅ | Edge functions |
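To confirm at runtime that a platform from the tables above exposes a usable WebAssembly implementation, a check along these lines could run before the module is loaded. This is a sketch under stated assumptions: `supportsWebAssembly` and `WasmLike` are hypothetical names that are not part of this package, and the only platform API used is the standard `WebAssembly.validate`.

```typescript
// Minimal shape of the WebAssembly global that this check relies on.
interface WasmLike {
  validate(bytes: Uint8Array): boolean;
}

// Smallest valid module: the "\0asm" magic number followed by version 1.
const EMPTY_MODULE = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Looked up through globalThis so the code also type-checks where the
// WebAssembly typings are not loaded.
const globalWasm: WasmLike | null =
  (globalThis as unknown as { WebAssembly?: WasmLike }).WebAssembly ?? null;

// Returns true only if the runtime accepts a minimal WebAssembly module.
// (Hypothetical helper; the `wasm` parameter exists to make testing easy.)
function supportsWebAssembly(wasm: WasmLike | null = globalWasm): boolean {
  if (!wasm || typeof wasm.validate !== 'function') return false;
  try {
    return wasm.validate(EMPTY_MODULE);
  } catch {
    return false; // treat any unexpected failure as "not supported"
  }
}
```

Validating a tiny module is slightly stricter than checking that the global exists, since it exercises the engine rather than just the presence of the object.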
💡 Usage Examples
1. Extract Metadata from Current Page
import { extractAll } from '@yfedoseev/meta-oxide-wasm';
const metadata = await extractAll(
document.documentElement.outerHTML,
{ baseUrl: window.location.href }
);
// Display Open Graph image
if (metadata.openGraph?.image) {
const img = document.createElement('img');
img.src = metadata.openGraph.image;
document.body.appendChild(img);
}
2. Build a Link Preview Generator
import { extractAll } from '@yfedoseev/meta-oxide-wasm';
interface LinkPreview {
title: string;
description: string;
image?: string;
siteName?: string;
}
async function generatePreview(url: string): Promise<LinkPreview> {
const response = await fetch(url);
const html = await response.text();
const metadata = await extractAll(html, { baseUrl: url });
return {
title: metadata.openGraph?.title ||
metadata.twitter?.title ||
metadata.meta?.title ||
'Untitled',
description: metadata.openGraph?.description ||
metadata.twitter?.description ||
metadata.meta?.description ||
'',
image: metadata.openGraph?.image ||
metadata.twitter?.image,
siteName: metadata.openGraph?.site_name,
};
}
const preview = await generatePreview('https://github.com');
console.log(preview);
3. SEO Analysis Tool
import { extractAll } from '@yfedoseev/meta-oxide-wasm';
interface SEOReport {
score: number;
issues: string[];
recommendations: string[];
}
async function analyzeSEO(html: string, url: string): Promise<SEOReport> {
const metadata = await extractAll(html, { baseUrl: url });
const issues: string[] = [];
const recommendations: string[] = [];
let score = 100;
// Check meta description
if (!metadata.meta?.description) {
issues.push('Missing meta description');
score -= 10;
} else if (metadata.meta.description.length < 120) {
recommendations.push('Meta description should be at least 120 characters');
score -= 5;
}
// Check Open Graph
if (!metadata.openGraph?.title) {
issues.push('Missing Open Graph title');
score -= 15;
}
if (!metadata.openGraph?.image) {
recommendations.push('Add Open Graph image for better social sharing');
score -= 10;
}
// Check Twitter Card
if (!metadata.twitter?.card) {
recommendations.push('Add Twitter Card metadata');
score -= 10;
}
// Check structured data
if (!metadata.jsonLd?.items?.length && !metadata.microdata?.items?.length) {
recommendations.push('Add structured data (JSON-LD or Microdata)');
score -= 15;
}
return { score: Math.max(0, score), issues, recommendations };
}
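// Fetch the page to analyze (html was previously undefined in this example)
const html = await fetch('https://example.com').then(r => r.text());
const report = await analyzeSEO(html, 'https://example.com');
console.log(`SEO Score: ${report.score}/100`);
4. Metadata Cache Service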
import { extractAll, type ExtractionResult } from '@yfedoseev/meta-oxide-wasm';
class MetadataCache {
private cache = new Map<string, { data: ExtractionResult; expires: number }>();
private ttl = 3600000; // 1 hour
async get(url: string): Promise<ExtractionResult> {
// Check cache
const cached = this.cache.get(url);
if (cached && cached.expires > Date.now()) {
return cached.data;
}
// Fetch and extract
const response = await fetch(url);
const html = await response.text();
const data = await extractAll(html, { baseUrl: url });
// Cache result
this.cache.set(url, {
data,
expires: Date.now() + this.ttl,
});
return data;
}
clear(): void {
this.cache.clear();
}
}
const cache = new MetadataCache();
const metadata = await cache.get('https://example.com');
⚡ Edge Computing Examples
Cloudflare Workers
import { extractAll } from '@yfedoseev/meta-oxide-wasm';
export default {
async fetch(request: Request): Promise<Response> {
const url = new URL(request.url);
const targetUrl = url.searchParams.get('url');
if (!targetUrl) {
return new Response('Missing url parameter', { status: 400 });
}
const response = await fetch(targetUrl);
const html = await response.text();
const metadata = await extractAll(html, { baseUrl: targetUrl });
return new Response(JSON.stringify(metadata), {
headers: { 'Content-Type': 'application/json' },
});
},
};
Vercel Edge Functions
// api/metadata.ts
import { extractAll } from '@yfedoseev/meta-oxide-wasm';
export const config = { runtime: 'edge' };
export default async function handler(request: Request) {
const { searchParams } = new URL(request.url);
const url = searchParams.get('url');
if (!url) {
return new Response('Missing url parameter', { status: 400 });
}
const response = await fetch(url);
const html = await response.text();
const metadata = await extractAll(html, { baseUrl: url });
return new Response(JSON.stringify(metadata), {
headers: { 'Content-Type': 'application/json' },
});
}
Deno Deploy
import { extractAll } from 'https://esm.sh/@yfedoseev/meta-oxide-wasm';
Deno.serve(async (request: Request) => {
const url = new URL(request.url);
const targetUrl = url.searchParams.get('url');
if (!targetUrl) {
return new Response('Missing url parameter', { status: 400 });
}
const response = await fetch(targetUrl);
const html = await response.text();
const metadata = await extractAll(html, { baseUrl: targetUrl });
return new Response(JSON.stringify(metadata), {
headers: { 'Content-Type': 'application/json' },
});
});
🎯 Supported Metadata Formats
1. HTML Meta Tags
Standard <meta> tags with name, property, or http-equiv attributes.
<meta name="description" content="Page description">
<meta name="keywords" content="keyword1, keyword2">
2. Open Graph Protocol
Facebook's Open Graph metadata for rich social sharing.
<meta property="og:title" content="Page Title">
<meta property="og:image" content="https://example.com/image.jpg">
3. Twitter Cards
Twitter's metadata format for card-based sharing.
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:site" content="@username">
4. JSON-LD
Google's preferred structured data format.
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "Article Title"
}
</script>
5. Microdata
HTML5 inline structured data.
<div itemscope itemtype="https://schema.org/Person">
<span itemprop="name">John Doe</span>
</div>
6. Microformats
Classic web microformats (h-card, h-entry, h-event, etc.).
<div class="h-card">
<a class="p-name u-url" href="https://example.com">John Doe</a>
</div>
7. RDFa
Resource Description Framework in Attributes.
<div vocab="https://schema.org/" typeof="Article">
<span property="headline">Title</span>
</div>
8. Dublin Core
Standard metadata vocabulary for digital resources.
<meta name="DC.title" content="Document Title">
<meta name="DC.creator" content="Author Name">
9. Web App Manifest
Progressive Web App manifest discovery.
<link rel="manifest" href="/manifest.json">
10. oEmbed
Embeddable content discovery.
<link rel="alternate" type="application/json+oembed"
href="https://example.com/oembed?url=...">
11. rel-* Links
Link relationships (canonical, alternate, prev, next, etc.).
<link rel="canonical" href="https://example.com/page">
<link rel="alternate" hreflang="es" href="https://example.com/es">
⚡ Performance
MetaOxide WASM is optimized for speed:
- Typical page: <10ms extraction time
- Complex page (multiple formats): <50ms
- WASM overhead: <1ms initialization (cached)
- Memory efficient: Minimal allocations, zero-copy parsing
Benchmark Comparison
| Library | Extraction Time | WASM Size | Notes |
|---------|----------------|-----------|-------|
| MetaOxide WASM | ~5ms | ~500KB | This library |
| metascraper | ~50ms | N/A | Node.js only |
| html-metadata | ~30ms | N/A | Node.js only |
| open-graph-scraper | ~40ms | N/A | Node.js only |
Benchmarks were performed on a typical blog post with Open Graph, Twitter Cards, and JSON-LD.
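To reproduce this kind of measurement on your own pages, a small timing harness could look like the sketch below. It is not the harness used for the numbers above: `benchmark` and `BenchResult` are hypothetical names, and the task being timed is passed in as a function so the sketch does not depend on any particular extractor.

```typescript
interface BenchResult {
  runs: number;
  medianMs: number;
  minMs: number;
  maxMs: number;
}

// Times an async task over several runs and reports the median, which is
// less sensitive to outliers (GC pauses, cold caches) than the mean.
async function benchmark(
  task: () => Promise<unknown>,
  runs = 20,
  now: () => number = () => Date.now(), // injectable clock
): Promise<BenchResult> {
  if (runs < 1) throw new RangeError('runs must be at least 1');
  await task(); // warm-up run, so one-time initialization is not counted
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = now();
    await task();
    samples.push(now() - start);
  }
  samples.sort((a, b) => a - b);
  const mid = Math.floor(samples.length / 2);
  const medianMs =
    samples.length % 2 === 1 ? samples[mid] : (samples[mid - 1] + samples[mid]) / 2;
  return { runs, medianMs, minMs: samples[0], maxMs: samples[samples.length - 1] };
}

// Assumed usage (not executed here):
//   const result = await benchmark(() => extractAll(html), 50, () => performance.now());
```

`Date.now()` only has millisecond resolution, so for extractions in the single-digit millisecond range it is worth passing `() => performance.now()` as the clock where the runtime provides it.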
🔧 Building from Source
Prerequisites
- Rust 1.70+ with wasm32 target
- Node.js 18+
- wasm-pack
Build Steps
# Clone the repository
git clone https://github.com/yfedoseev/meta-oxide.git
cd meta-oxide/bindings/wasm
# Install dependencies
npm install
# Build WASM module
npm run build
# Run tests
npm test
# Build for all targets
npm run build:all
Build Targets
- npm run build - Web target (ESM)
- npm run build:nodejs - Node.js target (CommonJS)
- npm run build:bundler - Bundler target (Webpack, Rollup, etc.)
🐛 Troubleshooting
WASM initialization fails in browser
Problem: WebAssembly.instantiate failed
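Solution: Ensure your server sends the correct MIME type for .wasm files:
Content-Type: application/wasm
For development servers:
// vite.config.js
export default {
  server: {
    fs: { allow: ['..'] }
  }
}
// webpack.config.js (Webpack 5: async WebAssembly must be enabled under experiments)
module.exports = {
  experiments: { asyncWebAssembly: true },
  module: {
    rules: [{
      test: /\.wasm$/,
      type: 'webassembly/async'
    }]
  }
}
Module not found in Node.js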
Problem: Cannot find module '@yfedoseev/meta-oxide-wasm'
Solution: Use Node.js 18+ and enable ESM in package.json:
{
  "type": "module"
}
Or use a dynamic import:
const { extractAll } = await import('@yfedoseev/meta-oxide-wasm');
TypeScript errors
Problem: Type errors when importing
Solution: Set moduleResolution to "node" in tsconfig.json:
{
  "compilerOptions": {
    "moduleResolution": "node",
    "esModuleInterop": true
  }
}
Performance issues
Problem: Slow extraction on large HTML
Solution:
- Extract only needed formats (use individual extractors)
- Limit HTML size before extraction
- Consider a streaming parser for very large documents
- Cache extraction results
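Limiting the HTML size could be sketched as follows. `limitHtml` is a hypothetical helper, not part of this package, and the default limit is an arbitrary assumption. Note the trade-off: cutting at the end of the head keeps meta tags, Open Graph, Twitter Cards, and link elements, but drops anything that lives in the body, such as Microdata, Microformats, RDFa, and JSON-LD placed there.

```typescript
// Returns `html` unchanged when it fits within `maxChars`. Otherwise it cuts
// right after the closing </head> tag if that fits within the limit, and
// falls back to a plain truncation when it does not. (Hypothetical helper.)
function limitHtml(html: string, maxChars = 512_000): string {
  if (html.length <= maxChars) return html;
  const headEnd = html.search(/<\/head\s*>/i);
  if (headEnd !== -1) {
    // Position just past the ">" that closes the </head> tag.
    const cut = headEnd + html.slice(headEnd).indexOf('>') + 1;
    if (cut <= maxChars) return html.slice(0, cut);
  }
  return html.slice(0, maxChars);
}

// Assumed usage (not executed here):
//   const og = await extractOpenGraph(limitHtml(html));
```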
// Extract only what you need
const og = await extractOpenGraph(html);
const twitter = await extractTwitter(html);
📊 Browser Compatibility
Check if WebAssembly is supported:
if (typeof WebAssembly === 'object') {
// WASM is supported
const metadata = await extractAll(html);
} else {
// Fallback to alternative solution
console.error('WebAssembly not supported');
}
🤝 Contributing
Contributions are welcome! See CONTRIBUTING.md for guidelines.
Development Setup
git clone https://github.com/yfedoseev/meta-oxide.git
cd meta-oxide/bindings/wasm
npm install
npm test
Running Examples
# Browser example
npm run example:browser
# Node.js example
node examples/node-wasm.js https://example.com
# Deno example
deno run --allow-net examples/deno.ts https://example.com
📄 License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT license (LICENSE-MIT)
at your option.
🙏 Acknowledgments
Built with:
- wasm-bindgen - Rust/WASM bindings
- scraper - HTML parsing
- serde - Serialization
📮 Support
- 📧 Email: [email protected]
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
Made with ❤️ by Yury Fedoseev
