url-to-json-markdown
v1.0.7
A TypeScript library that fetches URLs and converts them to structured JSON and Markdown format.
Built by the 16x Writer and 16x Eval team.
Installation
npm install url-to-json-markdown
Usage
import { urlToJsonMarkdown } from 'url-to-json-markdown';
// Reddit post (using fallback without credentials)
const post = await urlToJsonMarkdown(
  'https://www.reddit.com/r/example/comments/12345/title/'
);
console.log(post.title); // "Post Title"
console.log(post.content); // "# Post Title\n\nPost content...\n\nby _username_ (↑ 123) 12/25/2024"
console.log(post.type); // "reddit"
// Reddit post with credentials (more reliable)
const postWithCreds = await urlToJsonMarkdown(
  'https://www.reddit.com/r/example/comments/12345/title/',
  {
    clientId: 'your_client_id',
    clientSecret: 'your_client_secret',
  }
);
// Reddit post with comments included
const postWithComments = await urlToJsonMarkdown(
  'https://www.reddit.com/r/example/comments/12345/title/',
  {
    clientId: 'your_client_id',
    clientSecret: 'your_client_secret',
    includeComments: true,
  }
);
// Will include "## Comments" section with tree-structured comments
// Reddit comment
const comment = await urlToJsonMarkdown(
  'https://www.reddit.com/r/example/comments/12345/comment/abc123/'
);
console.log(comment.title); // "First line of comment..."
console.log(comment.content); // "# Comment by username\n\nComment text...\n\nby _username_ (↑ 45)"
console.log(comment.type); // "reddit"
// Reddit comment with child comments/replies
const commentWithReplies = await urlToJsonMarkdown(
  'https://www.reddit.com/r/example/comments/12345/comment/abc123/',
  { includeComments: true }
);
// Will include "## Replies" section with tree-structured child comments
// Generic web page
const webpage = await urlToJsonMarkdown('https://example.com/article');
console.log(webpage.title); // "Article Title"
console.log(webpage.content); // "# Article Title\n\nMain content as markdown..."
console.log(webpage.type); // "generic"
API
urlToJsonMarkdown(url: string, options?: RedditOptions): Promise<UrlToJsonResult>
Parameters:
- url: The URL to fetch and convert
- options: Optional Reddit configuration
Reddit Options:
interface RedditOptions {
  clientId?: string;
  clientSecret?: string;
  includeComments?: boolean;
}
- clientId & clientSecret: Reddit API credentials for more reliable access
- includeComments: Include comments in a tree structure (Reddit posts) or child comments/replies (Reddit comments)
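To make "tree structure" concrete, here is a minimal sketch of how nested comments could be rendered as an indented Markdown list. The `CommentNode` type and `renderCommentTree` function are hypothetical names made up for this illustration. They are not part of the library's API, and the library's actual output format may differ.

```typescript
// Hypothetical shape for a nested comment (not the library's internal type).
interface CommentNode {
  author: string;
  score: number;
  body: string;
  replies: CommentNode[];
}

// Render a comment tree as a nested Markdown list,
// adding two spaces of indentation per depth level.
function renderCommentTree(nodes: CommentNode[], depth = 0): string {
  const indent = '  '.repeat(depth);
  return nodes
    .map((node) => {
      const line = `${indent}- ${node.body} by _${node.author}_ (↑ ${node.score})`;
      const children = renderCommentTree(node.replies, depth + 1);
      return children ? `${line}\n${children}` : line;
    })
    .join('\n');
}

const tree: CommentNode[] = [
  {
    author: 'alice',
    score: 10,
    body: 'Top-level comment',
    replies: [{ author: 'bob', score: 3, body: 'A reply', replies: [] }],
  },
];
const rendered = renderCommentTree(tree);
```

This sketch assumes single-line comment bodies. Multi-line bodies would need their continuation lines indented as well.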
Return Type:
interface UrlToJsonResult {
  title: string;
  content: string;
  type: 'reddit' | 'generic';
}
Reddit Access
For Reddit URLs, the library supports two modes:
- Fallback mode (no credentials): Uses a browser user agent to access Reddit's public JSON API. May be subject to rate limiting.
- Authenticated mode (with credentials): Uses the Reddit OAuth API for more reliable access. Requires Reddit app credentials.
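Since fallback mode may be rate limited, callers might want to retry with a delay. The sketch below is one way to do that. It assumes a rate-limited request surfaces as a rejected promise, which this README does not confirm. `backoffDelays` and `withRetry` are hypothetical helpers, not part of the library.

```typescript
// Delays in milliseconds for exponential backoff: base, 2*base, 4*base, ...
function backoffDelays(retries: number, baseMs: number): number[] {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

// Run `task`; if it throws, wait for the next delay and try again.
// After the delays are exhausted, rethrow the last error.
async function withRetry<T>(
  task: () => Promise<T>,
  delaysMs: number[]
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= delaysMs.length; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt < delaysMs.length) {
        await new Promise((resolve) => setTimeout(resolve, delaysMs[attempt]));
      }
    }
  }
  throw lastError;
}
```

With the library imported, a call could look like `await withRetry(() => urlToJsonMarkdown(url), backoffDelays(3, 1000))`, which makes up to four attempts in total.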
To get Reddit credentials:
- Go to https://www.reddit.com/prefs/apps
- Create a new app (script type)
- Use the app's client ID and secret as clientId and clientSecret
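Once you have credentials, you will probably want to keep them out of source code. The following is a small sketch that builds the options object from an environment-like record. The variable names `REDDIT_CLIENT_ID` and `REDDIT_CLIENT_SECRET` and the helper `redditOptionsFromEnv` are assumptions for this example. The library itself does not read environment variables as far as this README states.

```typescript
// Mirrors the RedditOptions interface from the API section.
interface RedditOptions {
  clientId?: string;
  clientSecret?: string;
  includeComments?: boolean;
}

// Build options from an environment-like record.
// The variable names are hypothetical; pick whatever fits your setup.
function redditOptionsFromEnv(
  env: Record<string, string | undefined>,
  includeComments = false
): RedditOptions {
  const clientId = env.REDDIT_CLIENT_ID;
  const clientSecret = env.REDDIT_CLIENT_SECRET;
  // Pass credentials only when both are present; otherwise omit them
  // so the call runs in fallback mode.
  if (clientId && clientSecret) {
    return { clientId, clientSecret, includeComments };
  }
  return { includeComments };
}
```

In Node.js this could be called as `redditOptionsFromEnv(process.env, true)` and the result passed as the second argument to `urlToJsonMarkdown`.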
Supported URLs
- Reddit: Posts and comments from reddit.com
- Generic: Any website with automatic content extraction
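If you need to know in advance which kind of result a URL is likely to produce, a hostname check is a reasonable approximation. The library's actual detection logic is not documented here, so `guessSourceType` below is a hypothetical helper based only on the "reddit.com" wording above.

```typescript
type SourceType = 'reddit' | 'generic';

// Guess the result type from the hostname: reddit.com and its
// subdomains map to 'reddit', everything else to 'generic'.
function guessSourceType(url: string): SourceType {
  const host = new URL(url).hostname.toLowerCase();
  return host === 'reddit.com' || host.endsWith('.reddit.com')
    ? 'reddit'
    : 'generic';
}
```

The `type` field on the returned result remains the authoritative answer. This check is only useful for deciding up front whether to supply Reddit credentials.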
