@cognipeer/to-markdown
v2.0.1
A versatile utility library for converting various file formats to Markdown. Now with TypeScript support!
A versatile, TypeScript-first utility library for converting various file formats to Markdown.
✨ Features
- 🎯 Multiple Format Support: Convert PDF, DOCX, HTML, Excel, CSV, and more
- 📦 Simple API: Easy to use with Promise-based interface
- 🔧 TypeScript First: Written in TypeScript with full type definitions
- 🚀 Fast & Efficient: Optimized for performance with modular architecture
- 📚 Well Documented: Comprehensive documentation with examples
- 🎨 Customizable: Options to control conversion behavior
📦 Installation
npm install @cognipeer/to-markdown
Using other package managers:
# Yarn
yarn add @cognipeer/to-markdown
# pnpm
pnpm add @cognipeer/to-markdown
🔧 Development
Building from Source
# Install dependencies
npm install
# Build TypeScript and bundles
npm run build
# Watch mode for development
npm run dev
Scripts
- npm run build - Build TypeScript and create bundles
- npm run build:ts - Compile TypeScript only
- npm run build:rollup - Create rollup bundles only
- npm run clean - Remove dist directory
- npm run dev - Watch mode for TypeScript compilation
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
📝 Changelog
Version 2.0.0 (Latest)
- ✨ Rewritten in TypeScript with full type definitions
- 🏗️ Modular architecture with separate converter modules
- 📚 Comprehensive documentation with GitHub Pages
- 💡 Added usage examples
- 🎯 Improved error handling
- 📦 Better package exports (ESM + CJS)
Version 1.0.1
- Initial release with JavaScript implementation
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
👤 Author
Cognipeer
- GitHub: @Cognipeer
- npm: @cognipeer
🙏 Acknowledgments
Built with these amazing libraries:
- pdf2md - PDF parsing
- mammoth - DOCX conversion
- turndown - HTML to Markdown
- xlsx - Excel parsing
- cheerio - HTML parsing
- sharp - Image processing
Made with ❤️ by Cognipeer
🚀 Quick Start
Basic Usage
import fs from "fs";
import { convertToMarkdown, saveToMarkdownFile } from "@cognipeer/to-markdown";
// Convert from file path
const fromPath = await convertToMarkdown("/path/to/document.docx");
console.log(fromPath);
// Convert from buffer (pass fileName so the format can be identified)
const buffer = fs.readFileSync("document.pdf");
const fromBuffer = await convertToMarkdown(buffer, {
  fileName: "document.pdf",
});
console.log(fromBuffer);
// Convert from base64 string
const base64Content = "data:application/pdf;base64,JVBERi0xLjUNCiW...";
const fromBase64 = await convertToMarkdown(base64Content);
console.log(fromBase64);
// Save converted markdown to a file
await saveToMarkdownFile(fromPath, "converted-document", "./output");
TypeScript Usage
import {
  convertToMarkdown,
  saveToMarkdownFile,
  type ConverterOptions,
  type ConverterInput
} from "@cognipeer/to-markdown";
// Type-safe conversion
const options: ConverterOptions = {
  fileName: "document.pdf",
  forceExtension: ".pdf"
};
const input: ConverterInput = "./document.pdf";
const result: string = await convertToMarkdown(input, options);
📖 API Reference
convertToMarkdown(input, options?)
Converts various file formats to Markdown.
Parameters:
- input: ConverterInput - File path (string), base64 data (string), or Buffer
- options?: ConverterOptions - Optional configuration
  - fileName?: string - Name of the file (helpful for buffer inputs)
  - forceExtension?: string - Force a specific file extension for processing
  - url?: string - Original URL (used for web content like YouTube or Bing search)
Returns: Promise<string> - The converted markdown content
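The parameter list above can be written down as types. The sketch below is inferred from that list only and is not copied from the package's own type definitions: the names ConverterOptionsSketch, ConverterInputSketch, and resolveExtensionSketch are hypothetical, and the rule that forceExtension takes precedence over the extension in fileName is an assumption about how "force" is meant.

```typescript
// Sketch only: shapes inferred from the documented parameters,
// not the package's actual exported types.
interface ConverterOptionsSketch {
  fileName?: string;       // name of the file, helpful for buffer inputs
  forceExtension?: string; // e.g. ".pdf"
  url?: string;            // original URL for web content
}

// Node's Buffer is a Uint8Array subclass, so this also covers Buffer inputs.
type ConverterInputSketch = string | Uint8Array;

// Hypothetical helper: pick the extension that would drive format selection.
// ASSUMPTION: forceExtension wins over whatever fileName suggests.
function resolveExtensionSketch(
  options: ConverterOptionsSketch = {}
): string | undefined {
  const forced = options.forceExtension;
  if (forced) {
    // Accept both "pdf" and ".pdf", and compare case-insensitively.
    return (forced.startsWith(".") ? forced : `.${forced}`).toLowerCase();
  }
  const name = options.fileName ?? "";
  const dot = name.lastIndexOf(".");
  // dot > 0 skips names with no dot and dotfiles such as ".gitignore".
  return dot > 0 ? name.slice(dot).toLowerCase() : undefined;
}
```

With a helper like this, a buffer passed together with fileName "document.pdf" would be treated as a PDF, and forceExtension would override that choice when both are given.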
Example:
const markdown = await convertToMarkdown("./document.pdf", {
  forceExtension: ".pdf"
});
saveToMarkdownFile(content, fileName, outputDir?)
Saves the markdown content to a file.
Parameters:
- content: string - The markdown content to save
- fileName: string - Name for the output file (without .md extension)
- outputDir?: string - Directory to save the file (defaults to "output")
Returns: Promise<string> - Path to the saved file
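To make the described behavior concrete, here is a minimal sketch of a function with the same contract, built on Node's standard fs and path modules. It is not the package's implementation: markdownOutputPath and saveMarkdownSketch are hypothetical names, and creating a missing output directory is an assumption.

```typescript
import { mkdir, writeFile } from "node:fs/promises";
import { join } from "node:path";

// Hypothetical helper: build the target path, appending the .md extension
// that the fileName parameter is documented to omit.
function markdownOutputPath(fileName: string, outputDir: string = "output"): string {
  return join(outputDir, `${fileName}.md`);
}

// Sketch of a save function matching the documented signature.
async function saveMarkdownSketch(
  content: string,
  fileName: string,
  outputDir: string = "output"
): Promise<string> {
  const filePath = markdownOutputPath(fileName, outputDir);
  // ASSUMPTION: the directory is created when it does not exist yet.
  await mkdir(outputDir, { recursive: true });
  await writeFile(filePath, content, "utf-8");
  return filePath; // path to the saved file, as the Returns line describes
}
```

Keeping the path computation in its own function makes it possible to check the output location without touching the file system.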
Example:
const filePath = await saveToMarkdownFile(markdown, "document", "./output");
console.log(`Saved to: ${filePath}`);
📚 Documentation
For comprehensive documentation, please visit our documentation site.
💡 Examples
Check out the examples/ directory for more usage examples:
- Basic Usage - File path, buffer, and base64 conversions
- Spreadsheet Conversion - Excel and CSV to Markdown tables
- Advanced Usage - Jupyter notebooks and complex HTML
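As an illustration of what a spreadsheet-to-Markdown-table conversion produces, the sketch below turns already-parsed rows into a pipe table. It is not the library's converter, and parsing Excel or CSV files is out of scope here: rowsToMarkdownTable and escapeCell are hypothetical names, and treating the first row as the header is an assumption.

```typescript
// Escape characters that would break a Markdown table cell.
function escapeCell(value: string): string {
  return value.replace(/\|/g, "\\|").replace(/\r?\n/g, " ");
}

// Hypothetical helper: render rows of cells as a Markdown pipe table.
// ASSUMPTION: the first row is the header row.
function rowsToMarkdownTable(rows: string[][]): string {
  if (rows.length === 0) return "";
  // Use the widest row so ragged rows are padded with empty cells.
  const width = Math.max(...rows.map((row) => row.length));
  const pad = (row: string[]): string[] =>
    Array.from({ length: width }, (_, i) => escapeCell(row[i] ?? ""));
  const line = (cells: string[]): string => `| ${cells.join(" | ")} |`;
  const [header, ...body] = rows;
  return [
    line(pad(header)),
    line(Array.from({ length: width }, () => "---")), // header separator
    ...body.map((row) => line(pad(row))),
  ].join("\n");
}

const table = rowsToMarkdownTable([
  ["Name", "Qty"],
  ["Apple", "3"],
]);
console.log(table);
```

Escaping the pipe character matters because an unescaped "|" inside a cell would be read as a column boundary.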
Running Examples
# Using tsx (recommended for development)
npx tsx examples/basic-usage.ts
# Or build and run
npm run build
node dist/examples/basic-usage.js
🏗️ Project Structure
to-markdown/
├── src/
│ ├── converters/ # Format-specific converters
│ │ ├── pdf.ts
│ │ ├── docx.ts
│ │ ├── html.ts
│ │ └── ...
│ ├── types/ # TypeScript type definitions
│ │ └── index.ts
│ ├── utils/ # Utility functions
│ │ ├── markdown.ts
│ │ └── fileDetection.ts
│ └── index.ts # Main entry point
├── examples/ # Usage examples
├── docs/ # GitHub Pages documentation
├── dist/ # Compiled output
└── package.json
- Web content: Special handling for YouTube videos and Bing search results
Examples
Convert PDF to Markdown
import { convertToMarkdown } from "@cognipeer/to-markdown";
import fs from "fs";
const pdfBuffer = fs.readFileSync("document.pdf");
const markdown = await convertToMarkdown(pdfBuffer, {
  fileName: "document.pdf",
});
console.log(markdown);
Convert DOCX to Markdown
import { convertToMarkdown } from "@cognipeer/to-markdown";
const markdown = await convertToMarkdown("/path/to/document.docx");
console.log(markdown);
Convert HTML to Markdown
import { convertToMarkdown } from "@cognipeer/to-markdown";
import fs from "fs";
// Read the HTML as text and state the format explicitly,
// since a raw string carries no file extension
const htmlContent = fs.readFileSync("page.html", "utf-8");
const markdown = await convertToMarkdown(htmlContent, {
  forceExtension: ".html",
});
console.log(markdown);
Convert and Save to File
import { convertToMarkdown, saveToMarkdownFile } from "@cognipeer/to-markdown";
const markdown = await convertToMarkdown("/path/to/document.pdf");
const savedPath = await saveToMarkdownFile(
  markdown,
  "converted-document",
  "./output"
);
console.log(`Saved to: ${savedPath}`);
License
MIT
