html-ast-query
v2.0.0
A TypeScript library to fetch HTML, parse it into an AST, and query and manipulate it using custom React hooks, with iframe support
HTML AST Query Library
A powerful library to fetch HTML from URLs, parse it into an Abstract Syntax Tree (AST), and query/manipulate it using an intuitive API with React custom hooks.
Features
- Fetch HTML from any URL and parse into AST
- Query AST nodes using CSS-selector-like syntax
- Update values dynamically based on queries
- Parse JavaScript inside <script> tags using Acorn
- Convert modified AST back to HTML string
- React custom hooks for easy integration
- Support for parent/sibling navigation
Installation
npm install html-ast-query react react-dom
Core API
Parsing HTML
import { parseHtmlUrl, parseHtmlString, fromAST } from 'html-ast-query';
// Parse from URL
const ast = await parseHtmlUrl('https://example.com/page.html');
// Parse from string
const astFromString = parseHtmlString('<div><p>Hello</p></div>');
// Convert back to HTML
const html = fromAST(ast);
Querying AST
import { queryAST } from 'html-ast-query';
const q = queryAST(ast);
// Find nodes by type
q.find({ type: 'Element' });
// Find nodes by attributes
q.find({ where: { src: 'image.png' } });
// Find by value in JS AST
q.find({ where: { value: 'image' } });
// Chain queries
q.find({ where: { tag: 'img' } }).select('src').value();
Updating Values
// Update literal values
q.find({ where: { value: 'old-text' } }).update('new-text');
// Update specific property
q.find({ where: { tag: 'img' } }).select('src').update('new-image.png');
TypeScript Support
Built-in TypeScript definitions with full type safety for React and TSX.
import { useHtmlAST, ASTRenderer } from 'html-ast-query';
import type { HTMLASTNode, RenderOptions, QueryDefinition } from 'html-ast-query';
function App() {
  const { ast, loading } = useHtmlAST('https://example.com');
  if (loading) return <div>Loading...</div>;
  return <ASTRenderer ast={ast} />;
}
Vite Setup
Install as a dependency in your Vite project:
npm install html-ast-query
Use in your TSX files:
import { useIframeAST, queryAST } from 'html-ast-query';
// Fetch, modify, and render in iframe
function MyComponent() {
  const { iframeRef } = useIframeAST('https://example.com', [
    { find: { where: { tag: 'title' } }, select: 'value', update: 'Modified!' }
  ]);
  return <iframe ref={iframeRef} />;
}
TSX Rendering
Render parsed HTML directly as React elements instead of strings.
ASTRenderer Component
import { useHtmlAST } from 'html-ast-query';
import { ASTRenderer } from 'html-ast-query/renderer';
function MyComponent() {
  const { ast } = useHtmlAST('https://example.com');
  return (
    <div className="content">
      <ASTRenderer ast={ast} />
    </div>
  );
}
Custom Components
Map HTML tags to React components:
import { useHtmlAST } from 'html-ast-query';
import { ASTRenderer } from 'html-ast-query/renderer';
const CustomLink = ({ href, children }: { href: string; children: React.ReactNode }) => (
  <a href={href} className="custom-link">{children}</a>
);
function App() {
  const { ast } = useHtmlAST('https://example.com');
  return (
    <ASTRenderer
      ast={ast}
      options={{
        components: {
          a: CustomLink,
          // LazyImage is assumed to be your own component, defined elsewhere
          img: LazyImage,
        },
        skipScripts: true,
        skipComments: true,
      }}
    />
  );
}
Hook Version
import { useHtmlAST } from 'html-ast-query';
import { useASTRenderer } from 'html-ast-query/renderer';
function MyComponent() {
  const { ast } = useHtmlAST('https://example.com');
  const elements = useASTRenderer(ast, { skipScripts: true });
  return <div>{elements}</div>;
}
React Hooks
useIframeHtml
Fetch HTML and render it inside an <iframe> with full document isolation.
import { useIframeHtml } from 'html-ast-query';
function IframeViewer() {
  const { iframeRef, loading, error } = useIframeHtml('https://example.com');
  return (
    <div>
      {loading && <p>Loading...</p>}
      {error && <p>Error: {error}</p>}
      <iframe ref={iframeRef} style={{ width: '100%', height: '500px' }} />
    </div>
  );
}
Options:
- onLoad - Callback when iframe loads
- sandbox - Sandbox attribute string
useIframeAST
Fetch HTML as AST, apply queries, and render modified HTML in an iframe.
import { useIframeAST } from 'html-ast-query';
function EditableIframe() {
  const queries = [
    {
      find: { type: 'Element', where: { tag: 'h1' } },
      select: 'value',
      update: 'New Heading'
    }
  ];
  const { iframeRef, loading, error, applyQueries } = useIframeAST(
    'https://example.com',
    queries
  );
  return (
    <div>
      {loading && <p>Loading...</p>}
      {error && <p>Error: {error}</p>}
      <iframe ref={iframeRef} style={{ width: '100%', height: '500px' }} />
      <button onClick={() => applyQueries()}>Apply Changes</button>
    </div>
  );
}
useHtmlAST
Fetch and parse HTML from a URL.
import { useHtmlAST } from 'html-ast-query';
function MyComponent() {
  const { ast, loading, error, html, refresh } = useHtmlAST('https://example.com/page.html');
  if (loading) return <div>Loading...</div>;
  if (error) return <div>Error: {error}</div>;
  return (
    <div>
      <iframe srcDoc={html} />
      <button onClick={refresh}>Refresh</button>
    </div>
  );
}
Returns:
- ast - Parsed AST object
- astRef - React ref to the AST (for mutations)
- loading - Boolean loading state
- error - Error message if failed
- refresh - Function to re-fetch
- html - HTML string from AST
useASTQuery
Query and manipulate an existing AST.
import { useHtmlAST, useASTQuery } from 'html-ast-query';
function MyComponent() {
  const { ast, astRef, html } = useHtmlAST('https://example.com/page.html');
  const { find, select, update, getHtml } = useASTQuery(astRef);
  const handleUpdate = () => {
    // Find image URLs and update them
    update(
      { where: { value: 'image' } },
      'data',
      'https://new-image-url.com/image.png'
    );
    // Get updated HTML
    const newHtml = getHtml();
    // console.log(newHtml);
  };
  return (
    <div>
      <button onClick={handleUpdate}>Update Images</button>
    </div>
  );
}
Returns:
- find(query) - Find nodes matching query
- select(key) - Select property from nodes
- update(query, key, value) - Update values
- getHtml() - Get current HTML string
- getAST() - Get current AST
useHtmlQuery
Combined hook for fetching HTML and applying queries automatically.
import { useHtmlQuery } from 'html-ast-query';
function MyComponent() {
  const queries = [
    {
      find: { where: { value: 'image' } },
      select: 'data',
      update: 'https://new-image.com/banana.webp'
    },
    {
      find: { where: { tag: 'title' } },
      select: 'value',
      update: 'New Title'
    }
  ];
  const { html, loading, error, applyQueries } = useHtmlQuery(
    'https://example.com/page.html',
    queries
  );
  if (loading) return <div>Loading...</div>;
  if (error) return <div>Error: {error}</div>;
  return (
    <div>
      <iframe srcDoc={html} height="500" width="500" />
      <button onClick={() => applyQueries()}>Reapply Queries</button>
    </div>
  );
}
Parameters:
- url - HTML URL to fetch
- queries - Array of query objects to apply
- options - Additional options (e.g., prepare: true)
Query Object Format:
{
  find: { where: { key: 'value' } }, // Find criteria
  select: 'propertyName', // Property to select
  update: 'newValue' // New value to set
}
Returns:
- ast - Modified AST (or original if no queries)
- originalAst - Original unmodified AST
- loading - Boolean loading state
- error - Error message
- html - Final HTML string
- applyQueries - Function to manually reapply queries
- refresh - Function to re-fetch HTML
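To make the query object format above more concrete, here is a minimal, library-independent sketch of how a { find, select, update } object could be applied to a tree. It is not the library's implementation: SketchNode, findNodes, and applyQuery are hypothetical names, and the sketch assumes attributes are stored directly on each node, which may not match the real AST layout.

```typescript
// Hypothetical, simplified node: a plain object that may carry children.
interface SketchNode {
  type: string;
  children?: SketchNode[];
  [key: string]: unknown;
}

// Mirrors the `find` part of a query object.
interface FindCriteria {
  type?: string;
  where?: Record<string, unknown>;
}

// Mirrors the query object format: find, then select a property, then update it.
interface QueryObject {
  find: FindCriteria;
  select: string;
  update: unknown;
}

// A node matches when its type (if given) and every `where` entry are strictly equal.
function matches(node: SketchNode, criteria: FindCriteria): boolean {
  if (criteria.type !== undefined && node.type !== criteria.type) return false;
  return Object.entries(criteria.where ?? {}).every(
    ([key, expected]) => node[key] === expected
  );
}

// Depth-first traversal that collects every matching node.
function findNodes(root: SketchNode, criteria: FindCriteria): SketchNode[] {
  const found: SketchNode[] = matches(root, criteria) ? [root] : [];
  for (const child of root.children ?? []) {
    found.push(...findNodes(child, criteria));
  }
  return found;
}

// Sets the selected property on every match and returns how many nodes changed.
function applyQuery(root: SketchNode, query: QueryObject): number {
  const targets = findNodes(root, query.find);
  for (const node of targets) {
    node[query.select] = query.update;
  }
  return targets.length;
}

const doc: SketchNode = {
  type: 'Element',
  tag: 'body',
  children: [
    { type: 'Element', tag: 'img', src: 'old.png' },
    { type: 'Element', tag: 'p', children: [{ type: 'Text', value: 'Hello' }] },
  ],
};

const changed = applyQuery(doc, {
  find: { where: { tag: 'img' } },
  select: 'src',
  update: 'new-image.png',
});
```

The sketch mutates the tree in place, which is one plausible reason the hooks expose both ast and originalAst; whether the library copies the tree before applying queries is not stated here.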
Query API Reference
find(query)
Find nodes matching criteria.
// By node type
.find({ type: 'Element' })
.find({ type: 'Script' })
.find({ type: 'Literal' })
// By properties
.find({ where: { tag: 'div' } })
.find({ where: { src: 'image.png' } })
// By JS AST value
.find({ where: { value: 'some-string' } })
select(key)
Select a property from matched nodes.
.select('src') // Get src attribute
.select('children') // Get children array
.select('value') // Get value property
update(value)
Update the value of selected nodes.
.update('new-value')
value()
Get primitive values from Literal nodes.
.find({ type: 'Literal' }).value() // Returns actual values
parent()
Get parent nodes.
.find({ where: { tag: 'img' } }).parent()
siblings()
Get sibling nodes.
.find({ where: { tag: 'p' } }).siblings()
Advanced Example
import { useState } from 'react';
import { useHtmlQuery } from 'html-ast-query';
function EditablePage() {
  const [queries, setQueries] = useState([
    {
      find: { where: { value: 'MainImages_1' } },
      select: 'data',
      update: 'https://example.com/new-image.webp'
    }
  ]);
  const { html, loading, applyQueries } = useHtmlQuery(
    'https://cdn.example.com/template.html',
    queries
  );
  // Queue a second image replacement alongside the first
  const addImageUpdate = () => {
    setQueries(prev => [
      ...prev,
      {
        find: { where: { value: 'MainImages_2' } },
        select: 'data',
        update: 'https://example.com/another-image.webp'
      }
    ]);
  };
  if (loading) return <div>Loading template...</div>;
  return (
    <div>
      <iframe srcDoc={html} frameBorder="0" height="600" width="800" />
      <button onClick={addImageUpdate}>Add Image Update</button>
      <button onClick={() => applyQueries()}>Apply Changes</button>
    </div>
  );
}
How It Works
- Fetch - Uses fetch() to get HTML from URL
- Parse - Uses DOMParser to parse HTML into DOM
- Convert to AST - Recursively converts DOM nodes to AST:
  - Elements → { type: 'Element', tag, attrs, children }
  - Text → { type: 'Text', value }
  - Comments → { type: 'Comment', value }
  - Scripts → { type: 'Script', attrs, jsAST, jsCode }
  - Styles → { type: 'Style', attrs, content }
- Query - Traverse AST to find/update nodes
- Render - Convert AST back to HTML string
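The node shapes and the final Render step listed above can be sketched in a few lines of TypeScript. This is an illustration only, not the library's code: the ASTNode type, the VOID_TAGS set, and the escapeHtml and toHtml helpers are hypothetical, attrs is assumed to be a plain string-to-string map, and Script and Style nodes are left out for brevity.

```typescript
// Hypothetical node shapes modeled on the list above; the library's real types may differ.
type ASTNode =
  | { type: 'Element'; tag: string; attrs: Record<string, string>; children: ASTNode[] }
  | { type: 'Text'; value: string }
  | { type: 'Comment'; value: string };

// A few elements that never take a closing tag in HTML (not an exhaustive list).
const VOID_TAGS = new Set(['br', 'hr', 'img', 'input', 'link', 'meta']);

// Escape characters that would otherwise be read as markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Recursively turn a node back into an HTML string.
function toHtml(node: ASTNode): string {
  switch (node.type) {
    case 'Text':
      return escapeHtml(node.value);
    case 'Comment':
      return `<!--${node.value}-->`;
    case 'Element': {
      const attrs = Object.entries(node.attrs)
        .map(([name, value]) => ` ${name}="${escapeHtml(value)}"`)
        .join('');
      if (VOID_TAGS.has(node.tag)) return `<${node.tag}${attrs}>`;
      const inner = node.children.map(toHtml).join('');
      return `<${node.tag}${attrs}>${inner}</${node.tag}>`;
    }
  }
}

const sampleTree: ASTNode = {
  type: 'Element',
  tag: 'div',
  attrs: { class: 'card' },
  children: [
    { type: 'Element', tag: 'p', attrs: {}, children: [{ type: 'Text', value: 'Hello' }] },
    { type: 'Element', tag: 'img', attrs: { src: 'a.png' }, children: [] },
    { type: 'Comment', value: 'note' },
  ],
};

console.log(toHtml(sampleTree));
// prints: <div class="card"><p>Hello</p><img src="a.png"><!--note--></div>
```

Modeling the node as a discriminated union on type lets the compiler check that every node kind is handled in the switch, which is one reason a tagged shape like this is convenient for a serializer.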
Dependencies
- acorn - JavaScript parser for script tags
- astring - JavaScript code generator
- react - Peer dependency for hooks
License
ISC
