@fast-scrape/wasm
v0.2.0
WebAssembly bindings for scrape-rs HTML parsing library
10-50x faster HTML parsing in the browser. Native-speed parsing via WebAssembly.
Installation
npm install @fast-scrape/wasm
yarn add @fast-scrape/wasm
pnpm add @fast-scrape/wasm
bun add @fast-scrape/wasm
Quick start
import init, { Soup } from '@fast-scrape/wasm';
await init(); // Initialize WASM module (once)
const soup = new Soup("<html><body><div class='content'>Hello, World!</div></body></html>");
console.log(soup.find("div").text); // Hello, World!
[!IMPORTANT] Call init() once before using any other functions.
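Because init() has to run once before anything else, one defensive pattern is to cache the initialization promise so that repeated or concurrent callers share a single run. The sketch below is generic and makes no claim about how the package behaves internally: `once`, `ensureReady`, and `fakeInit` are hypothetical names, and `fakeInit` stands in for the package's `init` so the snippet runs without the WASM module.

```typescript
// Hypothetical helper: wraps an async initializer so it runs at most once,
// even when several callers request it concurrently.
function once<T>(initializer: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = initializer();
    }
    return pending;
  };
}

// Stand-in for the package's init() (an assumption for illustration);
// it only counts how often it runs.
let initCalls = 0;
async function fakeInit(): Promise<void> {
  initCalls += 1;
}

const ensureReady = once(fakeInit);

async function demo(): Promise<number> {
  // Three callers share one underlying initialization.
  await Promise.all([ensureReady(), ensureReady(), ensureReady()]);
  return initCalls;
}
```

In an application, `once(() => init())` would take the place of `once(fakeInit)`, and each entry point would `await ensureReady()` before constructing a `Soup`. Note that this sketch also caches a rejected promise, so a failed initialization is not retried.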
Usage
import init, { Soup } from '@fast-scrape/wasm';
await init();
const soup = new Soup(html);
// Find first element by tag
const div = soup.find("div");
// Find all elements
const divs = soup.findAll("div");
// CSS selectors
for (const el of soup.select("div.content > p")) {
  console.log(el.text);
}
Vite:
import init, { Soup } from '@fast-scrape/wasm';
await init(); // Vite handles WASM automatically
Webpack 5:
// webpack.config.js
module.exports = {
  experiments: { asyncWebAssembly: true },
};
CDN (esm.sh):
<script type="module">
import init, { Soup } from 'https://esm.sh/@fast-scrape/wasm';
await init();
const soup = new Soup('<div>Hello</div>');
console.log(soup.find('div').text);
</script>
TypeScript:
import init, { Soup, Tag } from '@fast-scrape/wasm';
await init();
function extractLinks(soup: Soup): string[] {
  return soup.select("a[href]").map(a => a.getAttribute("href") ?? "");
}
Bundle size
Optimizations in v0.2.0 bring the package to under 500 KB:
| Build | Size |
|-------|------|
| Minified + gzip | ~150 KB |
| Minified | ~400 KB |
[!TIP] SIMD is enabled automatically on Chrome 91+, Firefox 89+, and Safari 16.4+. v0.2.0 also includes zero-copy serialization, which saves 50-70% of memory during HTML extraction.
Browser support
| Browser | Version | SIMD |
|---------|---------|------|
| Chrome | 80+ | 91+ |
| Firefox | 75+ | 89+ |
| Safari | 13+ | 16.4+ |
| Edge | 80+ | 91+ |
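For visitors on browsers older than those listed above, a page may want to check for WebAssembly before loading the package at all. The following is a minimal sketch, assuming a plain capability check is enough. `hasWebAssembly` and `parserMode` are hypothetical names, not part of the package, and the check does not detect SIMD support.

```typescript
// Hypothetical helper: reports whether the runtime exposes the WebAssembly API.
// It checks only that the API is present, not whether SIMD is available.
function hasWebAssembly(): boolean {
  const wasm = (globalThis as unknown as {
    WebAssembly?: { instantiate?: unknown };
  }).WebAssembly;
  return (
    typeof wasm === "object" &&
    wasm !== null &&
    typeof wasm.instantiate === "function"
  );
}

// Decide up front which code path the page should take.
const parserMode: "wasm" | "fallback" = hasWebAssembly() ? "wasm" : "fallback";
```

When `parserMode` is `"fallback"`, the page could skip importing @fast-scrape/wasm and rely on the browser's built-in `DOMParser` instead.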
Built on Servo
Powered by battle-tested libraries from the Servo browser engine: html5ever (HTML5 parser) and selectors (CSS selector engine).
Related packages
| Platform | Package |
|----------|---------|
| Rust | scrape-core |
| Python | fast-scrape |
| Node.js | @fast-scrape/node |
License
MIT OR Apache-2.0
