make-abstract

v2.7.0

A CLI tool for making abstracts using AI and Zotero

A command-line tool for automatically generating academic abstracts from PDFs in your Zotero library and processing PDF/TXT files directly using AI. Supports both traditional abstract generation and structured screening questionnaires.

Features

  • Multiple Input Support: Process Zotero items OR PDF/TXT files directly
  • Multiple AI Providers: Supports OpenAI, Google Gemini, Anthropic Claude, Groq, DeepSeek, Cerebras, Mistral, and xAI
  • Flexible Output Modes:
    • Text Mode: Generate traditional abstracts
    • JSON Mode: Structured screening with questionnaires
    • Scan Mode: Direct PDF transcription using AI vision
    • Categorize Mode: AI-powered Zotero collection management and item categorization
  • Multiple Output Destinations: Update Zotero abstracts, create notes, print to console, or save to files
  • Batch Processing: Handle multiple items/PDFs at once
  • Custom Prompts & Screening Questions: Use your own AI prompts or screening questionnaires
  • Automatic Tag Management: Boolean screening results automatically update Zotero tags
  • Collection Management: AI-powered categorization and automatic Zotero collection creation
  • File Accumulation: Process multiple items into single output files
  • Selective Replacement: Replace existing abstracts only when they match a configured pattern

Installation

npm install -g make-abstract

Usage

Basic Usage

Zotero Items

# Single item by key
make-abstract ABCD1234

# Multiple items by keys
make-abstract ABCD1234 EFGH5678 IJKL9012

# Using Zotero select links (automatically extracts keys)
make-abstract "zotero://select/groups/12345/items/ABCD1234"

# Mix of keys and select links
make-abstract ABCD1234 "zotero://select/groups/12345/items/EFGH5678"
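
A select link carries the item key as its last path segment. The sketch below shows one way such a key could be pulled out of a link. `extractItemKey` is a hypothetical helper, not the tool's own code, and it assumes keys are eight uppercase letters or digits, as in the examples above.

```typescript
// Hypothetical helper: normalise a CLI input to a bare Zotero item key.
// Assumption: keys are 8 uppercase letters/digits, as in the examples above.
function extractItemKey(input: string): string | null {
  const bareKey = /^[A-Z0-9]{8}$/;
  if (bareKey.test(input)) {
    return input;
  }
  // Matches zotero://select/<library part>/items/<KEY>
  const match = /^zotero:\/\/select\/.*\/items\/([A-Z0-9]{8})$/.exec(input);
  return match?.[1] ?? null;
}

console.log(extractItemKey("zotero://select/groups/12345/items/ABCD1234")); // ABCD1234
console.log(extractItemKey("EFGH5678")); // EFGH5678
console.log(extractItemKey("paper.pdf")); // null
```

Returning `null` for anything that is not a key or a select link leaves room for the caller to treat the input as a file path instead.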

PDF/TXT Files (Direct Processing)

# Single PDF file
make-abstract paper1.pdf --dest print

# Single TXT file
make-abstract document.txt --dest print

# Multiple files (mix of PDF and TXT)
make-abstract paper1.pdf document.txt paper2.pdf --dest file --out combined_results

# Mix of files and Zotero items
make-abstract ABCD1234 paper.pdf document.txt EFGH5678

Note: PDF/TXT files can only use --dest print or --dest file (not abstract or note).

Output Modes

Text Mode (Default)

Generates traditional academic abstracts:

# Generate abstract for Zotero item
make-abstract ABCD1234

# Generate abstract for PDF with custom prompt
make-abstract paper.pdf --prompt custom-prompt.txt --dest print

# Generate abstract for TXT file
make-abstract document.txt --dest print

JSON Mode

Structured screening using questionnaires:

# Screen Zotero items with questionnaire
make-abstract ABCD1234 EFGH5678 --mode json --prompt screening-questions.txt --dest file --out results

# Screen PDF files
make-abstract paper1.pdf paper2.pdf --mode json --prompt questions.txt --dest print

# Screen TXT files
make-abstract document1.txt document2.txt --mode json --prompt questions.txt --dest print

JSON Mode Requirements:

  • Must use --prompt with screening questions file
  • Automatically updates Zotero tags for boolean questions
  • Outputs structured JSON with metadata

Scan Mode

Direct PDF transcription using AI vision capabilities:

# Scan PDF for text transcription (basic)
make-abstract paper.pdf --mode scan --dest print

# Scan PDF with custom prompt
make-abstract paper.pdf --mode scan --prompt transcription-prompt.txt --dest file --out scanned_content

# Batch scan multiple PDFs
make-abstract paper1.pdf paper2.pdf paper3.pdf --mode scan --dest file --out all_scans

# Save scan results to automatically named files (result-{filename}.txt by default)
make-abstract paper.pdf --mode scan --save-scan --dest print

# Save scan with custom prompt to result files
make-abstract document1.pdf document2.pdf --mode scan --save-scan --prompt extraction-prompt.txt --dest print

# Customize the prefix for saved files
make-abstract paper.pdf --mode scan --save-scan --save-prefix "extracted-" --dest print

# Results in files like: extracted-paper.txt instead of result-paper.txt

Categorize Mode

Automatically categorize Zotero items into collections using AI:

# Categorize items using a collection structure JSON file
make-abstract categorize collections.json "zotero://select/groups/12345/items/ABCD1234" "zotero://select/groups/12345/items/EFGH5678"

# Track costs during categorization
make-abstract categorize collections.json "zotero://select/groups/12345/items/ABCD1234" --cost categorization-costs

Categorize Mode Features:

  • Collection Management: Automatically creates missing collections based on hierarchical structure
  • Smart Matching: Only creates collections that don't already exist (matches by hierarchy + name)
  • AI-Powered Categorization: Uses AI to analyze item content and suggest appropriate collections
  • Batch Processing: Categorize multiple items at once
  • Group Confirmation: Confirms target Zotero group before making changes
  • Cost Tracking: Optional cost analysis for AI categorization calls
  • Hierarchical Support: Supports nested collection structures with parent-child relationships

Collection JSON Structure:

[
  {
    "name": "Research Methods",
    "description": "Studies focusing on research methodologies and approaches",
    "children": [
      {
        "name": "Quantitative Methods",
        "description": "Studies using quantitative research approaches, statistical analysis, surveys, experiments"
      },
      {
        "name": "Qualitative Methods",
        "description": "Studies using qualitative research approaches, interviews, case studies, ethnography"
      }
    ]
  },
  {
    "name": "Subject Areas",
    "description": "Classification by academic discipline or field of study",
    "children": [
      {
        "name": "Education",
        "description": "Educational research, pedagogy, learning theories, curriculum development"
      },
      {
        "name": "Psychology",
        "description": "Psychological studies, cognitive science, behavioral research",
        "disabled": true
      }
    ]
  }
]

Collection Properties:

  • name (required): The collection name
  • description (optional): Description of what items belong in this collection
  • children (optional): Array of child collections for hierarchical structure
  • disabled (optional): When true, the collection will be created in Zotero but excluded from AI categorization prompts
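
The properties above map onto a small recursive type. The sketch below uses the hypothetical names `CollectionNode` and `promptPaths` to list the collections that would be offered to the AI, skipping any marked `disabled`. Two things are assumptions: that `disabled` hides only the node itself and not its children, and the `" / "` path separator.

```typescript
// Hypothetical type mirroring the documented collection properties.
interface CollectionNode {
  name: string;
  description?: string;
  children?: CollectionNode[];
  disabled?: boolean;
}

// List "Parent / Child" paths for every collection that is not disabled.
// Assumption: `disabled` hides only the node itself, not its children.
function promptPaths(nodes: CollectionNode[], parent: string = ""): string[] {
  const paths: string[] = [];
  for (const node of nodes) {
    const path = parent === "" ? node.name : parent + " / " + node.name;
    if (node.disabled !== true) {
      paths.push(path);
    }
    if (node.children) {
      for (const childPath of promptPaths(node.children, path)) {
        paths.push(childPath);
      }
    }
  }
  return paths;
}

const sampleStructure: CollectionNode[] = [
  {
    name: "Subject Areas",
    children: [
      { name: "Education" },
      { name: "Psychology", disabled: true },
    ],
  },
];

// Lists "Subject Areas" and "Subject Areas / Education"; "Psychology" is skipped.
console.log(promptPaths(sampleStructure));
```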

Categorize Mode Process:

  1. Group Verification: Queries Zotero API for the selected group's name and asks for user confirmation
  2. Collection Sync: Fetches existing collections and creates only missing ones based on the JSON structure
  3. AI Analysis: For each Zotero item, prompts AI to analyze content and suggest appropriate collections
  4. Collection Assignment: Automatically adds items to the AI-suggested collections via Zotero API
  5. Progress Tracking: Shows real-time progress and cost information during processing

Requirements:

  • All Zotero URLs must belong to the same group
  • JSON file must follow the Collection structure format
  • Requires valid Zotero API configuration
  • Only works with Zotero items (not PDF/TXT files)

Scan Mode Features:

  • Uses AI vision to directly read PDF content (no OCR extraction first)
  • Complete document transcription: Extracts ALL readable text including headers, body text, captions, footnotes, and annotations
  • Simple table conversion: Converts tables to clean key: value format instead of complex table structures
  • Plain text output: No markdown, no table formatting, just clean readable text
  • Smart text correction: Applies OCR error correction for merged words, character substitutions, and spelling errors
  • Advanced selection detection: Correctly interprets crossed-out text as SELECTED, plus checkmarks, circles, and other markings
  • Clean organization: Natural document flow with key: value pairs for structured data and plain paragraphs for text
  • Works with image-based PDFs, scanned documents, and text PDFs
  • Only available for PDF files (not TXT files or Zotero items)
  • Can use custom prompts for specific transcription needs
  • --save-scan option automatically saves results to {prefix}{original_filename}.txt files (prefix customizable with --save-prefix)
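
The `{prefix}{original_filename}.txt` naming can be sketched as follows. `scanOutputName` is a hypothetical helper. It only builds the file name, since the directory the tool writes to is not stated here.

```typescript
// Hypothetical helper: derive the --save-scan output name from the input path.
// Follows the documented pattern {prefix}{original_filename}.txt.
function scanOutputName(inputPath: string, prefix: string = "result-"): string {
  // Keep only the file name, dropping any directory part.
  const fileName = inputPath.split(/[\\/]/).pop() ?? inputPath;
  // Drop the final extension (e.g. ".pdf"), if there is one.
  const dot = fileName.lastIndexOf(".");
  const stem = dot > 0 ? fileName.slice(0, dot) : fileName;
  return prefix + stem + ".txt";
}

console.log(scanOutputName("paper.pdf")); // result-paper.txt
console.log(scanOutputName("scans/paper.pdf", "extracted-")); // extracted-paper.txt
```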

Output Destinations

Use the --dest option to control where the output goes:

# Update the Zotero item's abstract field (default, Zotero only)
make-abstract ABCD1234 --dest abstract

# Create a note attachment in Zotero (Zotero only)
make-abstract ABCD1234 --dest note

# Print to console
make-abstract ABCD1234 --dest print

# Save to file with optional custom filename
make-abstract ABCD1234 --dest file --out my_abstract

Cost Tracking

Use the --cost option to track token usage and costs for all API calls. Costs are calculated from January 2025 pricing:

# Track costs and save to JSON file
make-abstract ABCD1234 --cost usage-report

# Works with all modes and destinations
make-abstract paper.pdf --mode scan --cost scan-costs.json
make-abstract *.pdf --mode json --prompt questions.txt --cost batch-analysis

# Combine with any other options
make-abstract ABCD1234 EFGH5678 --mode json --prompt screening.txt --dest file --out results --cost project-costs

Cost Tracking Features:

  • Complete Model Coverage: 200+ models across 8 major AI providers
  • Current 2025 Pricing: Updated with latest rates, including promotional pricing
  • Comprehensive Providers: OpenAI, Anthropic, Google Gemini, Groq, DeepSeek, xAI, Cerebras, Mistral
  • Detailed Tracking: Records all API calls with token usage (input/output/total)
  • Rich Reports: JSON exports with timestamps, model info, and cost breakdowns

Comprehensive Model Coverage (per million tokens):

OpenAI (50+ models):

  • Latest: GPT-4.1 ($2.00/$8.00), GPT-4.1-mini ($0.40/$1.60), o3 ($20.00/$80.00)
  • Flagship: GPT-4o ($2.50/$10.00), GPT-4o-mini ($0.15/$0.60), ChatGPT-4o-latest ($5.00/$15.00)
  • Reasoning: o1 ($15.00/$60.00), o1-pro ($60.00/$240.00), o1-mini ($3.00/$12.00)
  • Legacy: GPT-4 ($30.00/$60.00), GPT-3.5-turbo ($0.50/$1.50), all turbo variants

Google Gemini (40+ models):

  • Thinking Models: Gemini 2.5 Pro ($1.25/$10.00), Gemini 2.5 Flash ($0.30/$2.50)
  • Audio Models: Native Audio Dialog ($0.50/$2.00), TTS ($0.50-$1.00 input)
  • Stable: Gemini 1.5 Pro ($1.25/$5.00), Gemini 1.5 Flash ($0.075/$0.30)
  • Efficient: Gemini 2.0 Flash ($0.10/$0.40), Flash-Lite ($0.075/$0.30)
  • Free: All Gemma open models and embeddings

Anthropic (20+ models):

  • Latest: Claude 4 Opus ($20.00/$100.00), Claude 4 Sonnet ($5.00/$25.00)
  • Extended Thinking: Claude 4 Opus Extended ($30.00/$150.00)
  • Current: Claude 3.5 Sonnet ($3.00/$15.00), Claude 3.5 Haiku ($1.00/$5.00)

Other Providers:

  • DeepSeek: V3 ($0.14/$0.28), R1 Reasoner ($0.55/$2.19) - promotional pricing
  • Groq: Llama 4 Scout ($0.11/$0.34), Llama 4 Maverick ($0.20/$0.60)
  • xAI: Grok-3 ($2.00/$10.00), Grok-3-mini ($0.50/$2.50)

Cost Report Structure

{
  "totalCost": {
    "inputCost": 0.00125,
    "outputCost": 0.00240,
    "totalCost": 0.00365,
    "totalTokens": 1450,
    "totalPromptTokens": 500,
    "totalCompletionTokens": 950
  },
  "apiCalls": [
    {
      "id": "api_1704067200_abc123def",
      "timestamp": "2024-01-01T10:00:00.000Z",
      "provider": "openai",
      "model": "gpt-4o-mini",
      "operation": "generateText",
      "input": "Create a concise academic abstract...",
      "usage": {
        "promptTokens": 250,
        "completionTokens": 150,
        "totalTokens": 400
      },
      "costs": {
        "inputCost": 0.000375,
        "outputCost": 0.0009,
        "totalCost": 0.001275
      }
    }
  ],
  "metadata": {
    "generatedAt": "2024-01-01T10:05:00.000Z",
    "provider": "openai",
    "currency": "USD"
  }
}
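
When reading a report, the totals can be cross-checked against the individual calls. The sketch below assumes the `totalCost` object is a plain sum over `apiCalls`, which the sample above does not state explicitly. The type names are hypothetical, and the cost figures in the usage example are placeholders rather than real prices.

```typescript
// Hypothetical types covering only the report fields used here.
interface ApiCallRecord {
  usage: { promptTokens: number; completionTokens: number; totalTokens: number };
  costs: { inputCost: number; outputCost: number; totalCost: number };
}

interface ReportTotals {
  inputCost: number;
  outputCost: number;
  totalCost: number;
  totalTokens: number;
  totalPromptTokens: number;
  totalCompletionTokens: number;
}

// Sum per-call usage and costs into the shape of the "totalCost" object.
function sumCalls(calls: ApiCallRecord[]): ReportTotals {
  const sums: ReportTotals = {
    inputCost: 0,
    outputCost: 0,
    totalCost: 0,
    totalTokens: 0,
    totalPromptTokens: 0,
    totalCompletionTokens: 0,
  };
  for (const call of calls) {
    sums.inputCost += call.costs.inputCost;
    sums.outputCost += call.costs.outputCost;
    sums.totalCost += call.costs.totalCost;
    sums.totalTokens += call.usage.totalTokens;
    sums.totalPromptTokens += call.usage.promptTokens;
    sums.totalCompletionTokens += call.usage.completionTokens;
  }
  return sums;
}

// Placeholder figures for two calls.
const reportTotals = sumCalls([
  { usage: { promptTokens: 250, completionTokens: 150, totalTokens: 400 },
    costs: { inputCost: 0.01, outputCost: 0.02, totalCost: 0.03 } },
  { usage: { promptTokens: 250, completionTokens: 800, totalTokens: 1050 },
    costs: { inputCost: 0.01, outputCost: 0.05, totalCost: 0.06 } },
]);
console.log(reportTotals.totalTokens); // 1450
```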

File Output Behavior

Text Mode Files (.txt)

  • Single item: Creates new file each run
  • Multiple items: All items appended to same file with separators
  • Custom filename: --out filename (adds .txt automatically)
  • Default filename: abstract_{key}_{date}.txt

JSON Mode Files (.json)

  • Single item: Creates JSON array with one object
  • Multiple items: All items in single JSON array
  • Custom filename: --out filename (adds .json automatically)
  • Default filename: screening_{key}_{date}.json

Screening Questions

Create screening questionnaires for systematic reviews or data extraction:

Question Format

Each line: key|type|question

Supported Types:

  • boolean - True/false questions (creates Zotero tags)
  • string - Text responses
  • array - List of items
  • number - Numeric values

Example Screening File (screening-questions.txt):

hasRCT|boolean|Is this a randomized controlled trial?
sampleSize|number|What is the sample size?
methodology|string|What research methodology was used?
outcomes|array|What outcomes were measured?
hasBlinding|boolean|Was blinding used in the study?
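
The `key|type|question` format is simple enough to parse by hand. The sketch below is a hypothetical parser, not the tool's own. It assumes blank lines are skipped, malformed lines are rejected, and any `|` after the second one belongs to the question text.

```typescript
type QuestionType = "boolean" | "string" | "array" | "number";

interface ScreeningQuestion {
  key: string;
  type: QuestionType;
  question: string;
}

const QUESTION_TYPES: QuestionType[] = ["boolean", "string", "array", "number"];

// Hypothetical parser for the key|type|question format.
// Assumption: blank lines are skipped and malformed lines raise an error.
function parseScreeningFile(text: string): ScreeningQuestion[] {
  const questions: ScreeningQuestion[] = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") {
      continue;
    }
    const first = line.indexOf("|");
    const second = line.indexOf("|", first + 1);
    if (first < 0 || second < 0) {
      throw new Error("Expected key|type|question, got: " + line);
    }
    const key = line.slice(0, first).trim();
    const typeName = line.slice(first + 1, second).trim() as QuestionType;
    // Everything after the second "|" is the question text.
    const question = line.slice(second + 1).trim();
    if (QUESTION_TYPES.indexOf(typeName) < 0) {
      throw new Error("Unsupported type in line: " + line);
    }
    questions.push({ key, type: typeName, question });
  }
  return questions;
}

const parsedQuestions = parseScreeningFile(
  "hasRCT|boolean|Is this a randomized controlled trial?\n" +
  "sampleSize|number|What is the sample size?\n"
);
console.log(parsedQuestions.map((q) => q.key + ":" + q.type).join(", "));
// hasRCT:boolean, sampleSize:number
```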

Automatic Tag Management

Boolean questions automatically create/update Zotero tags:

  • Format: _a:{key}={value} (e.g., _a:hasRCT=true)
  • Removes old tags with same key before adding new ones
  • Only boolean questions create tags
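
The replace-then-add behaviour can be sketched on a plain list of tag strings. `applyBooleanTag` is a hypothetical helper, and the real tool works against the Zotero API rather than an in-memory array.

```typescript
// Hypothetical helper: apply one boolean answer to an item's tag list.
// Tag format from above: {prefix}{key}={value}, with "_a:" as the default prefix.
function applyBooleanTag(
  tags: string[],
  key: string,
  value: boolean,
  prefix: string = "_a:"
): string[] {
  const stem = prefix + key + "=";
  // Drop any earlier answer for the same key, then add the new one.
  const kept = tags.filter((tag) => tag.indexOf(stem) !== 0);
  kept.push(stem + String(value));
  return kept;
}

console.log(applyBooleanTag(["review", "_a:hasRCT=false"], "hasRCT", true));
// The stale "_a:hasRCT=false" is replaced by "_a:hasRCT=true"; "review" is kept.
```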

Complete Document Processing with Scan Mode

Scan mode provides comprehensive document transcription with special attention to structured content. It extracts ALL text while intelligently processing forms, tables, and other structured elements:

Simple Processing Features:

  • Complete text extraction: Captures all headers, body text, captions, footnotes, and annotations
  • Key-value table conversion: Converts tables to simple key: value format instead of complex structures
  • Plain text output: No markdown formatting, no table structures, just clean readable text
  • Form data extraction: Converts form fields to key: value pairs
  • Multiple input types: Handles text boxes, checkboxes, radio buttons, dropdown selections
  • Smart selection detection:
    • Crossed-out text = SELECTED (important: crossed out means chosen, not ignored)
    • Checkmarks (✓, ✗, ✔) = SELECTED
    • Circled or highlighted options = SELECTED
    • X marks in boxes = SELECTED
    • Identifies final answers when multiple options are marked
    • Converts handwriting to readable text
    • Shows selections as key: selected_option format
  • OCR error correction: Fixes merged words, character substitutions, and spelling errors
  • Clean organization: Logical reading order with simple key: value pairs and plain paragraphs

Example Document Processing:

# Complete document transcription with enhanced table/form processing
make-abstract research-paper.pdf --mode scan --dest print

# Process mixed content (text + forms + tables) with comprehensive extraction
make-abstract complex-document.pdf --mode scan --dest file --out complete-transcription

# Use specialized prompt for specific document types
make-abstract technical-report.pdf --mode scan --prompt custom-extraction-prompt.txt --dest file --out extracted-data

Scan mode emits structured data as simple key: value lines and regular text as plain paragraphs, for example:

Student Name: SYNCLAIRE CHELIMO
Age: 17
Gender: FEMALE (crossed out = selected)
Has Disabilities: NO (checkmark = selected)
Preferred Subject: Mathematics (circled = selected)
Grade/Form: 1234567
School Type: PUBLIC

This is an example of how regular paragraph text would appear in the output. All text content is preserved and presented in clean, readable format. The scan mode correctly identifies crossed-out text, checkmarks, and other markings as selections rather than corrections.

Custom Prompts

Use your own AI prompt from a text file:

# Text mode with custom prompt
make-abstract ABCD1234 --prompt my-prompt.txt

# JSON mode with screening questions (required)
make-abstract ABCD1234 --mode json --prompt screening-questions.txt

Example Custom Prompt (example-prompt.txt):

Create a detailed academic abstract for the following research paper. The abstract should be structured with the following elements:

1. Background/Context: Briefly explain the research problem or gap
2. Objective: State the main research question or hypothesis
3. Methods: Describe the methodology or approach used
4. Results: Summarize the key findings
5. Conclusions: Highlight the main implications and contributions

The abstract should be approximately 200-250 words and written in a formal academic tone.

IMPORTANT: Respond with ONLY the abstract text. Do not include section headers or formatting.

Example Form Extraction Prompt (form-extraction-prompt.txt):

You are an expert form processing system. Analyze this document and extract all form data with high accuracy.

FORM ANALYSIS GUIDELINES:

1. FIELD IDENTIFICATION:
   - Detect all form fields, labels, and input areas
   - Recognize various input types: text boxes, checkboxes, radio buttons, dropdowns
   - Identify required vs optional fields

2. RESPONSE EXTRACTION:
   - Text fields: Extract all typed or handwritten content
   - Single-choice questions: Identify the ONE selected option (ignore crossed-out marks)
   - Multiple-choice: List ALL selected options
   - Yes/No questions: Provide clear "YES" or "NO" based on checkmarks
   - Numerical fields: Extract exact numbers (grades, scores, IDs, phone numbers)

3. INTELLIGENT INTERPRETATION:
   - Crossed-out text = IGNORE (these are mistakes/corrections)
   - Multiple marks on single-choice = Choose the clearest/final mark
   - Handwriting: Convert to most likely intended text
   - Empty fields: Mark as "Not filled" or "N/A"

4. OUTPUT FORMAT - Present as a clean table:
| Question/Field | Response | Notes |
|----------------|----------|-------|
| [Field name]   | [Answer] | [Any clarifications] |

5. QUALITY CHECKS:
   - Verify logical consistency between related fields
   - Flag any unclear or ambiguous responses in Notes column
   - Ensure all visible form fields are captured

CRITICAL: Focus on ACCURACY. Extract only final, intended responses. Ignore stray marks and corrections.

Complete Examples

# Traditional abstract generation
make-abstract ABCD1234 EFGH5678 --dest file --prompt academic-prompt.txt

# Systematic review screening
make-abstract ABCD1234 EFGH5678 --mode json --prompt screening-questions.txt --dest file --out review_screening

# Mixed processing (files and Zotero items)
make-abstract ABCD1234 paper1.pdf document.txt paper2.pdf --dest print

# Create notes with screening results
make-abstract ABCD1234 --mode json --prompt questions.txt --dest note

# Batch file processing to single file
make-abstract *.pdf *.txt --mode json --prompt screening.txt --dest file --out all_papers

# PDF transcription with AI vision
make-abstract scanned-document.pdf --mode scan --dest file --out transcribed_text

# Custom transcription for specific document types
make-abstract table-heavy-document.pdf --mode scan --prompt table-extraction-prompt.txt --dest print

# Form processing with specialized prompt
make-abstract survey-form.pdf --mode scan --prompt form-extraction-prompt.txt --dest file --out form-data

# Comprehensive OCR text correction
make-abstract ocr-text-with-errors.txt --dest print

# Cost tracking examples
make-abstract ABCD1234 --cost usage-report.json
make-abstract paper1.pdf paper2.pdf --mode scan --cost scan-analysis --dest file --out transcripts

# Save scan results to automatically named files
make-abstract report.pdf research-paper.pdf --mode scan --save-scan --dest print

# Customize file prefixes for saved files
make-abstract paper.pdf --mode scan --save-scan --save-prefix "scan-" --dest print
make-abstract ABCD1234 --mode json --prompt screening.txt --dest note --save-json --save-prefix "screening-"

# Process TXT files for abstract generation
make-abstract research-notes.txt manuscript-draft.txt --dest file --out text-abstracts

# Mix of PDF and TXT files with JSON screening
make-abstract paper.pdf notes.txt appendix.pdf --mode json --prompt screening.txt --dest file --out mixed-analysis

# Process multiple forms with intelligent extraction
make-abstract student-form1.pdf student-form2.pdf --mode scan --save-scan --dest print

# Batch form processing with custom prompt
make-abstract survey-*.pdf --mode scan --prompt form-extraction-prompt.txt --dest file --out all-forms

JSON Output Structure

JSON mode produces structured output with metadata:

[
  {
    "zotero_id": "ABCD1234",
    "date_generated": "2024-01-15T10:30:00.000Z", 
    "model": "gpt-4o-mini",
    "prompt": "hasRCT|boolean|Is this a randomized controlled trial?...",
    "result": {
      "hasRCT": true,
      "sampleSize": 150,
      "methodology": "Double-blind randomized controlled trial",
      "outcomes": ["primary efficacy", "safety", "quality of life"],
      "hasBlinding": true
    }
  }
]

For PDF/TXT files, zotero_id will be null.
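
For downstream scripts, the output array can be given a type and filtered. The sketch below is an assumption-labelled reading of the structure above: `ScreeningRecord` and `idsWhereTrue` are hypothetical names, and `array` answers are assumed to be arrays of strings.

```typescript
// Hypothetical type for one entry of the JSON-mode output array.
// Assumption: "array" answers are arrays of strings.
type ScreeningValue = boolean | number | string | string[];

interface ScreeningRecord {
  zotero_id: string | null; // null for PDF/TXT inputs
  date_generated: string;
  model: string;
  prompt: string;
  result: { [key: string]: ScreeningValue };
}

// Return the Zotero ids of records whose boolean answer for `key` is true.
function idsWhereTrue(records: ScreeningRecord[], key: string): string[] {
  const ids: string[] = [];
  for (const record of records) {
    if (record.result[key] === true && record.zotero_id !== null) {
      ids.push(record.zotero_id);
    }
  }
  return ids;
}

const sampleRecords: ScreeningRecord[] = [
  {
    zotero_id: "ABCD1234",
    date_generated: "2024-01-15T10:30:00.000Z",
    model: "gpt-4o-mini",
    prompt: "hasRCT|boolean|Is this a randomized controlled trial?",
    result: { hasRCT: true, sampleSize: 150 },
  },
  {
    zotero_id: null,
    date_generated: "2024-01-15T10:31:00.000Z",
    model: "gpt-4o-mini",
    prompt: "hasRCT|boolean|Is this a randomized controlled trial?",
    result: { hasRCT: true },
  },
];

console.log(idsWhereTrue(sampleRecords, "hasRCT")); // only "ABCD1234" has an id
```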

Initial Setup

On first run, the tool will guide you through configuration:

  1. Choose your AI provider (OpenAI, Gemini, Anthropic, Groq, DeepSeek, Cerebras, Mistral, or xAI)
  2. Enter your API key for the chosen provider
  3. Enter your Zotero API key (if processing Zotero items)
  4. Enter your Zotero group ID (if processing Zotero items)
  5. Optionally configure the AI model

Note: For PDF/TXT-only usage, only AI configuration is required.

AI Provider and Model Selection

Listing and Selecting Models

Use the models command to view available models for each provider and select one:

# Interactive provider and model selection
make-abstract models

# List models for a specific provider
make-abstract models openai
make-abstract models anthropic
make-abstract models groq

# Show help for models command
make-abstract models --help

The models command will:

  1. Show all available models for the selected provider
  2. Let you select a model interactively
  3. Update your configuration with the selected model
  4. Optionally set the provider as your active AI provider

Supported Providers and Models (Updated June 2025)

OpenAI

  • GPT-4 series: gpt-4o, gpt-4o-mini, gpt-4o-audio-preview, gpt-4-turbo, gpt-4
  • GPT-4.1 series: gpt-4.1, gpt-4.1-mini, gpt-4.1-nano
  • GPT-4.5 preview: gpt-4.5-preview
  • Reasoning models: o1, o1-mini, o1-preview, o1-pro, o3, o3-mini, o4-mini

Anthropic Claude

  • Claude 4: claude-4-opus, claude-4-sonnet
  • Claude 3.7: claude-3-7-sonnet
  • Claude 3.5: claude-3-5-sonnet-v2, claude-3-5-sonnet, claude-3-5-haiku
  • Claude 3: claude-3-opus, claude-3-sonnet, claude-3-haiku

Google Gemini

  • Gemini 2.5: gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite-preview-06-17
  • Gemini 2.0: gemini-2.0-flash, gemini-2.0-flash-lite, gemini-2.0-flash-preview-image-generation
  • Gemini 1.5: gemini-1.5-flash, gemini-1.5-flash-8b, gemini-1.5-pro

Groq

  • Llama models: llama-3.3-70b-versatile, llama-3.1-70b-versatile, llama-3.1-8b-instant
  • Mixtral: mixtral-8x7b-32768
  • Gemma: gemma-7b-it, gemma2-9b-it

DeepSeek

  • Latest: deepseek-v3, deepseek-r1, deepseek-r1-lite-preview
  • Specialized: deepseek-prover-v2
  • Legacy: deepseek-v2.5, deepseek-v2, deepseek-coder

Cerebras

  • Llama models: llama-3.3-70b, llama-3.1-70b, llama-3.1-8b, llama-3-70b, llama-3-8b

Mistral

  • Latest: mistral-large-2411, mistral-small-3.1-2503, codestral-2501
  • Specialized: mistral-ocr-2505, mistral-nemo
  • Legacy: mixtral-8x7b-instruct, mixtral-8x22b-instruct

xAI Grok

  • Grok 3: grok-3-beta, grok-3-mini-beta
  • Grok 2: grok-2-beta, grok-2-mini-beta
  • Grok 1: grok-beta

Configuration

The tool stores configuration in your user directory. Available settings:

  • aiProvider: AI service to use ('openai', 'gemini', 'anthropic', 'groq', 'deepseek', 'cerebras', 'mistral', 'xai')
  • openaiApiKey: Your OpenAI API key
  • geminiApiKey: Your Google Gemini API key
  • anthropicApiKey: Your Anthropic API key
  • groqApiKey: Your Groq API key
  • deepseekApiKey: Your DeepSeek API key
  • cerebrasApiKey: Your Cerebras API key
  • mistralApiKey: Your Mistral API key
  • xaiApiKey: Your xAI API key
  • zoteroApiKey: Your Zotero API key
  • zoteroGroupId: Your Zotero group ID
  • openaiModel: OpenAI model to use (default: "gpt-4o")
  • geminiModel: Gemini model to use (default: "gemini-2.5-flash")
  • anthropicModel: Anthropic model to use
  • groqModel: Groq model to use
  • deepseekModel: DeepSeek model to use
  • cerebrasModel: Cerebras model to use
  • mistralModel: Mistral model to use
  • xaiModel: xAI model to use
  • temperature: Generation temperature (default: 0.7, range: 0-1)
  • replacePattern: Regex pattern to match abstracts that should be replaced

Configuration management:

# Set a configuration value
make-abstract config set <key> <value>

# Examples of setting replace patterns:
make-abstract config set replacePattern "/^AI-generated:/"  # Matches abstracts starting with "AI-generated:"
make-abstract config set replacePattern "\[AUTO\]"         # Matches abstracts containing the literal text "[AUTO]" (unescaped, [AUTO] is a character class)
make-abstract config set replacePattern ".*"              # Matches any abstract (allows replacement of all)

# View current configuration (API keys will be masked)
make-abstract config list

# Delete configuration
make-abstract config delete <key>
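
Because replacePattern is described as a regex, ordinary regex rules apply to the value. The sketch below shows how a few patterns behave as JavaScript regular expressions. `toRegExp` is a hypothetical helper, and stripping the surrounding slashes from a value like "/^AI-generated:/" is an assumption about how the tool reads it.

```typescript
// Hypothetical helper: turn a replacePattern value into a RegExp.
// Assumption: a value wrapped in slashes has the slashes stripped;
// anything else is used as the pattern source directly.
function toRegExp(pattern: string): RegExp {
  const wrapped =
    pattern.length >= 2 &&
    pattern.charAt(0) === "/" &&
    pattern.charAt(pattern.length - 1) === "/";
  return new RegExp(wrapped ? pattern.slice(1, -1) : pattern);
}

// Anchored prefix match.
console.log(toRegExp("/^AI-generated:/").test("AI-generated: draft")); // true
// Escaped brackets match the literal text "[AUTO]".
console.log(toRegExp("\\[AUTO\\]").test("[AUTO] placeholder")); // true
// Unescaped, [AUTO] is a character class: any A, U, T or O matches.
console.log(toRegExp("[AUTO]").test("Another abstract")); // true
// ".*" matches everything, including an empty abstract.
console.log(toRegExp(".*").test("")); // true
```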

Command Reference

make-abstract [options] <inputs...>

Arguments:
  inputs                    Zotero item keys/select links OR PDF/TXT file paths

Options:
  --dest <destination>      Output destination: abstract, note, print, file (default: "abstract")
  --prompt <file>           Path to text file with custom prompt or screening questions
  --out <filename>          Output filename when using --dest file (optional)
  --mode <mode>             Output mode: text, json, scan (default: "text")
  --cost <filename>         Save token usage and cost analysis to JSON file
  --tagprefix <prefix>      Prefix for screening tags (default: "_a:")
  --save-json               Save JSON result for each Zotero item (json mode only)
  --save-scan               Save scan result to a file named {prefix}{original filename}.txt (scan mode only)
  --save-prefix <prefix>    Prefix for saved files from --save-json and --save-scan options (default: "result-")
  --keysfile <file>         Path to text file containing Zotero keys (one per line)
  -h, --help                Display help for command

Commands:
  models [provider]        List and select models for AI providers
  config                   Manage configuration
  categorize               Categorize Zotero items into collections using AI

OCR Text Correction

The tool includes intelligent OCR text processing that comprehensively corrects errors to make text as legible as possible:

What it fixes:

  • Merged words: "researchersconducted" → "researchers conducted"
  • Character substitutions: "efective" → "effective", "recieve" → "receive"
  • Common OCR errors: "rn" → "m", "cl" → "d", "vv" → "w"
  • Spelling mistakes: "introasting" → "interesting"
  • Missing spaces and punctuation where context is clear
  • Grammar errors caused by OCR corruption

Examples of Comprehensive Correction:

Input:  "The researchersconducted anexperiment tomeasure theefectiveness ofthe treatrnent."
Output: "The researchers conducted an experiment to measure the effectiveness of the treatment."

Input:  "Participantswere randornlyassigned tocontrol andtreatrnent groupsfor cornparison."
Output: "Participants were randomly assigned to control and treatment groups for comparison."

Input:  "The finclings suggest that this approach is very efective for introasting results."
Output: "The findings suggest that this approach is very effective for interesting results."

Applies to all modes:

  • Text mode: Abstracts generated from fully corrected text
  • JSON mode: Screening questions answered using corrected text
  • Scan mode: PDF transcription with comprehensive error correction

File Processing Notes

  • PDF/TXT files are processed directly without Zotero integration
  • Text extraction uses pdf-parse library for PDF files in regular text mode
  • TXT files are read directly as plain text
  • Scan mode uses AI vision for direct PDF processing with OCR enhancement (PDF only)
  • Only --dest print and --dest file are supported for PDF/TXT files
  • No tag management for PDF/TXT files (tags are Zotero-specific)

Error Handling

The tool continues processing if individual items fail:

  • Invalid Zotero keys are skipped with error messages
  • PDF/TXT files that can't be read are skipped
  • Failed AI requests are logged but don't stop batch processing
  • Configuration errors halt execution with helpful messages
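
The skip-and-continue behaviour amounts to catching errors per item instead of per batch. A minimal synchronous sketch with hypothetical names (the real tool makes asynchronous API calls):

```typescript
interface BatchOutcome<T> {
  results: T[];
  failures: { input: string; message: string }[];
}

// Hypothetical batch loop: a failure on one input is recorded and skipped,
// and the remaining inputs are still processed.
function processBatch<T>(inputs: string[], handle: (input: string) => T): BatchOutcome<T> {
  const outcome: BatchOutcome<T> = { results: [], failures: [] };
  for (const input of inputs) {
    try {
      outcome.results.push(handle(input));
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      outcome.failures.push({ input, message });
    }
  }
  return outcome;
}

// Stand-in handler that fails for one of three inputs.
const batchOutcome = processBatch(["a.pdf", "missing.pdf", "b.txt"], (input) => {
  if (input === "missing.pdf") {
    throw new Error("file cannot be read");
  }
  return input.toUpperCase();
});

console.log(batchOutcome.results); // A.PDF and B.TXT were processed
console.log(batchOutcome.failures); // missing.pdf is reported, not fatal
```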

Development

  1. Clone the repository
  2. Install dependencies:
    npm install
  3. Run in development mode:
    npm run dev
  4. Build the project:
    npm run build

License

MIT