npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

@ujeebu-org/langchain

v0.2.1

Published

LangChain integration for Ujeebu Extract API

Readme

LangChain Ujeebu Integration (Node.js/TypeScript)

npm version License: MIT Node.js >= 16

Official LangChain integration for Ujeebu Extract API - Extract clean, structured content from news articles and blog posts for use with Large Language Models (LLMs) and AI applications.

Features

  • Easy Integration: Seamlessly integrate Ujeebu Extract API with LangChain agents and chains
  • Document Loaders: Load articles as LangChain Documents for use with vector stores and retrievers
  • Agent Tools: Use Ujeebu Extract as a tool in LangChain agents
  • Rich Metadata: Extract article text, HTML, author, publication date, images, and more
  • Quick Mode: Optional fast extraction mode (30-60% faster)
  • TypeScript Support: Full TypeScript types and interfaces

What is Ujeebu Extract?

Ujeebu Extract converts news and blog articles into clean, structured JSON data. It extracts:

  • Clean article text and HTML
  • Author and publication date
  • Title and summary
  • Images and media
  • RSS feeds
  • Site metadata

Perfect for RAG (Retrieval-Augmented Generation) applications, content analysis, and LLM training data.

Installation

npm install @ujeebu-org/langchain
# or
yarn add @ujeebu-org/langchain
# or
pnpm add @ujeebu-org/langchain

Requirements

  • Node.js 16.0 or higher
  • @langchain/core 0.3.0 or higher
  • An Ujeebu API key (Get one here)

Quick Start

Set up your API key

export UJEEBU_API_KEY="your-api-key"

Or set it programmatically:

process.env.UJEEBU_API_KEY = "your-api-key";

Using as an Agent Tool

npm install @ujeebu-org/langchain @langchain/core @langchain/openai @langchain/langgraph langchain
import { UjeebuExtractTool } from '@ujeebu-org/langchain';
import { createAgent } from 'langchain';
import { ChatOpenAI } from '@langchain/openai';

// Initialize the tool
const ujeebuTool = new UjeebuExtractTool();

// Create an agent
const model = new ChatOpenAI({ temperature: 0 });
const agent = createAgent({ model, tools: [ujeebuTool] });

// Use the agent
const response = await agent.invoke({
  messages: [{ role: 'user', content: 'Extract the article from https://example.com/article and summarize it' }],
});
console.log(response.messages[response.messages.length - 1].content);

Using the Document Loader

npm install @ujeebu-org/langchain @langchain/core @langchain/openai @langchain/community
import { UjeebuLoader } from '@ujeebu-org/langchain';
import { FaissStore } from '@langchain/community/vectorstores/faiss';
import { OpenAIEmbeddings } from '@langchain/openai';

// Load articles
const loader = new UjeebuLoader({
  urls: [
    'https://example.com/article1',
    'https://example.com/article2',
    'https://example.com/article3'
  ]
});
const documents = await loader.load();

// Create a vector store
const embeddings = new OpenAIEmbeddings();
const vectorStore = await FaissStore.fromDocuments(documents, embeddings);

// Query the documents
const results = await vectorStore.similaritySearch('What are the main topics?');

Usage Examples

Basic Article Extraction

import { UjeebuExtractTool } from '@ujeebu-org/langchain';

const tool = new UjeebuExtractTool();
const result = await tool.invoke({
  url: 'https://example.com/article',
  text: true,
  author: true,
  pub_date: true
});
console.log(result);

Extract with Images

import { UjeebuExtractTool } from '@ujeebu-org/langchain';

const tool = new UjeebuExtractTool();
const result = await tool.invoke({
  url: 'https://example.com/article',
  images: true  // Extract article images
});

Quick Mode for Faster Extraction

import { UjeebuLoader } from '@ujeebu-org/langchain';

const loader = new UjeebuLoader({
  urls: ['https://example.com/article'],
  quickMode: true  // 30-60% faster, slightly less accurate
});
const documents = await loader.load();

Load with HTML Content

import { UjeebuLoader } from '@ujeebu-org/langchain';

const loader = new UjeebuLoader({
  urls: ['https://example.com/article'],
  extractHtml: true,   // Include HTML content
  extractImages: true  // Include images
});
const documents = await loader.load();

// Access metadata
const doc = documents[0];
console.log(`Title: ${doc.metadata.title}`);
console.log(`Author: ${doc.metadata.author}`);
console.log(`Images: ${doc.metadata.images}`);

Build a QA System

import { UjeebuLoader } from '@ujeebu-org/langchain';
import { FaissStore } from '@langchain/community/vectorstores/faiss';
import { OpenAIEmbeddings, ChatOpenAI } from '@langchain/openai';
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';

// Load articles
const loader = new UjeebuLoader({
  urls: [
    'https://example.com/article1',
    'https://example.com/article2'
  ]
});
const documents = await loader.load();

// Create vector store
const embeddings = new OpenAIEmbeddings();
const vectorStore = await FaissStore.fromDocuments(documents, embeddings);
const retriever = vectorStore.asRetriever();

// Retrieve and answer
const query = 'What are the main points?';
const relevantDocs = await retriever.invoke(query);
const context = relevantDocs.map((doc) => doc.pageContent).join('\n\n');

const prompt = ChatPromptTemplate.fromMessages([
  ['system', 'Answer based on the following context:\n\n{context}'],
  ['human', '{question}'],
]);

const chain = prompt.pipe(new ChatOpenAI({ temperature: 0 })).pipe(new StringOutputParser());
const answer = await chain.invoke({ context, question: query });
console.log(answer);

API Reference

UjeebuExtractTool

A LangChain tool for extracting article content.

Constructor Parameters:

  • apiKey (string, optional): Ujeebu API key. Defaults to UJEEBU_API_KEY environment variable.
  • baseUrl (string, optional): Custom API endpoint URL.

Tool Parameters (UjeebuExtractInput):

  • url (string, required): URL of the article to extract
  • text (boolean, optional): Extract article text (default: true)
  • html (boolean, optional): Extract article HTML (default: false)
  • author (boolean, optional): Extract article author (default: true)
  • pub_date (boolean, optional): Extract publication date (default: true)
  • images (boolean, optional): Extract images (default: false)
  • quick_mode (boolean, optional): Use quick mode for faster extraction (default: false)

UjeebuLoader

A LangChain document loader for articles.

Constructor Parameters (UjeebuLoaderParams):

  • urls (string[], required): List of article URLs to load
  • apiKey (string, optional): Ujeebu API key
  • extractText (boolean, optional): Extract article text (default: true)
  • extractHtml (boolean, optional): Extract article HTML (default: false)
  • extractAuthor (boolean, optional): Extract author (default: true)
  • extractPubDate (boolean, optional): Extract publication date (default: true)
  • extractImages (boolean, optional): Extract images (default: false)
  • quickMode (boolean, optional): Use quick mode (default: false)
  • baseUrl (string, optional): Custom API endpoint URL

Methods:

  • load(): Promise<Document[]> - Load all documents

Document Metadata:

  • source: Original URL
  • url: Resolved URL
  • canonical_url: Canonical URL
  • title: Article title
  • author: Article author
  • pub_date: Publication date
  • language: Article language
  • site_name: Site name
  • summary: Article summary
  • image: Main image URL
  • images: List of all image URLs (if extractImages=true)

Advanced Usage

Custom API Endpoint

import { UjeebuLoader } from '@ujeebu-org/langchain';

const loader = new UjeebuLoader({
  urls: ['https://example.com/article'],
  baseUrl: 'https://custom-api.ujeebu.com/extract'
});

Error Handling

import { UjeebuLoader } from '@ujeebu-org/langchain';

const loader = new UjeebuLoader({
  urls: ['https://example.com/article']
});

try {
  const documents = await loader.load();
  console.log(`Loaded ${documents.length} documents`);
} catch (error) {
  if (error instanceof Error) {
    console.error(`Error: ${error.message}`);
  }
}

Testing

Run the test suite:

# Install dependencies
npm install

# Build the package
npm run build

# Run tests
npm test

# Run tests with coverage
npm run test:coverage

# Run linting
npm run lint

# Format code
npm run format

Examples

Check out the examples directory for more usage examples:

Pricing

Ujeebu Extract API pricing is based on usage. Check the pricing page for details.

Support

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Related Projects

  • LangChain - Build applications with LLMs through composability
  • Ujeebu API - Web scraping and content extraction API

Changelog

0.2.0

  • Upgrade to LangChain 1.x compatibility
  • Import BaseDocumentLoader from @langchain/core instead of langchain
  • Only @langchain/core required as peer dependency (not langchain)
  • Update examples to use @langchain/langgraph for agents

0.1.0

  • Initial release
  • UjeebuExtractTool for LangChain agents
  • UjeebuLoader document loader
  • Full TypeScript support
  • Comprehensive test coverage
  • Complete documentation