phylo-cli

v1.0.2

Published

3 months ago

Config-driven AI file processor CLI - process markdown files with OpenAI, Claude, Gemini and more

tonyhicks20

ai cli processor openai anthropic claude gemini llm batch-processing markdown file-processing automation

A simple, powerful command-line tool for batch processing markdown files using AI models. Perfect for analyzing journals, transforming documents, generating summaries, and automating content workflows.

Features

🤖 Multi-provider support - OpenAI, Anthropic Claude, Google Gemini, Mistral, Groq, and more via abso-ai
📦 Batch processing - Process multiple files together for context-aware transformations
🔄 Processor chaining - Chain multiple AI processing steps together
💾 Incremental processing - Automatically tracks progress and resumes from failures
🔒 Secure API key management - Multiple secure options for storing credentials
⚙️ Config-driven - All settings in a single JSON file for easy automation
🔔 Auto-update notifications - Automatically checks for newer versions and notifies you

Installation

For End Users

Install globally with npm:

npm install -g phylo-cli

Verify installation:

phylo-cli --help

For Development

Clone the repository and link locally:

git clone <repository-url>
cd <repository-directory>/packages/cli
npm install
npm run build
npm link

To unlink during development:

npm unlink -g phylo-cli

Quick Start

Create a config file

phylo-cli --init

This creates phylo.config.json in your current directory.

Set up API keys (see API Key Configuration below)
Configure your project

Edit phylo.config.json:

{
  "input_folder": "./journals",
  "input_file_pattern": "**/*.md",
  "max_batch_size": null,
  "processors": {
    "main": {
      "prompt_files": ["prompts/analyze.md"],
      "model": "claude-sonnet-4-20250514",
      "output_folder": "./analysis",
      "output_file_extension": ".md"
    }
  }
}

Run the processor

phylo-cli --config phylo.config.json

API Key Configuration

Choose the method that best fits your security requirements:

Option 1: Environment Variables (RECOMMENDED)

Most secure for personal use. Set API keys in your shell profile:

# Add to ~/.zshrc or ~/.bashrc
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."

Pros:

Keys only exist in shell environment
No files to accidentally commit
Standard practice for CLI tools

Cons:

Need to set up on each machine
Keys visible in process environment

Option 2: Global Config File

Balanced security for CLI tools. Create a protected config file:

phylo-cli --setup-keys

This creates ~/.phylo/config with 0600 permissions (read/write for owner only):

{
  "api_keys": {
    "ANTHROPIC_API_KEY": "sk-ant-...",
    "OPENAI_API_KEY": "sk-..."
  }
}

Pros:

Keys stored in one place for all projects
Automatic permission checking
Easy to manage

Cons:

Keys stored in plaintext file
Vulnerable if system is compromised
Can be backed up to insecure locations

Security checks:

File permissions verified on every run
Warning displayed if permissions too permissive
Keys must be added to the file by hand (it is never auto-populated)

Option 3: Project .env File

Not recommended for CLI tools; better suited to per-project apps.

Create .env in your project directory:

ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...

⚠️ WARNING: Add .env to .gitignore to prevent committing secrets!

Option 4: Config File Env Object

For automation and CI only.

Add directly to phylo.config.json:

{
  "env": {
    "ANTHROPIC_API_KEY": "sk-ant-...",
    "OPENAI_API_KEY": "sk-..."
  }
}

⚠️ WARNING: Never commit config files with API keys to git!

Configuration Priority

When multiple sources are present, priority is (lowest to highest):

System environment variables
Global config (~/.phylo/config)
Project .env file
Config JSON env object (highest)
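The priority order above can be pictured as a layered merge in which each later source overwrites the earlier ones. The following is only an illustrative Python sketch, not the CLI's actual code; the function name `resolve_api_keys` and its parameters are hypothetical.

```python
def resolve_api_keys(system_env, global_config, dotenv, config_env):
    """Merge key sources so that later (higher-priority) sources win.

    Hypothetical helper: each argument is a plain dict of NAME -> key.
    """
    merged = {}
    # Ordered lowest to highest priority, mirroring the documented list.
    for source in (system_env, global_config, dotenv, config_env):
        merged.update(source)
    return merged


keys = resolve_api_keys(
    system_env={"OPENAI_API_KEY": "sk-from-shell"},
    global_config={
        "OPENAI_API_KEY": "sk-from-global",
        "ANTHROPIC_API_KEY": "sk-ant-from-global",
    },
    dotenv={},
    config_env={"ANTHROPIC_API_KEY": "sk-ant-from-config"},
)
print(keys)
```

In this sketch the global config overrides the shell value for OPENAI_API_KEY, and the config JSON env object overrides the global config for ANTHROPIC_API_KEY.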

Configuration

Basic Configuration

{
  "input_folder": "./input",
  "input_file_pattern": "**/*.md",
  "max_batch_size": null,
  "last_processed_file": null,
  "processors": {
    "main": {
      "prompt_files": ["prompts/instructions.md"],
      "model": "gpt-4o",
      "output_folder": "./output",
      "output_file_extension": ".md"
    }
  }
}
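To preview which files a given input_folder and input_file_pattern might select before running the CLI, a small Python sketch like the one below can help. It assumes the CLI's glob semantics are close to those of Python's `pathlib`, which this README does not confirm; `match_input_files` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def match_input_files(input_folder, pattern="**/*.md"):
    """List files under input_folder matching pattern, as sorted relative paths."""
    root = Path(input_folder)
    return sorted(
        p.relative_to(root).as_posix() for p in root.glob(pattern) if p.is_file()
    )


# Build a throwaway folder so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "2024").mkdir()
    (root / "notes.md").write_text("top-level note", encoding="utf-8")
    (root / "2024" / "jan.md").write_text("nested note", encoding="utf-8")
    (root / "2024" / "todo.txt").write_text("not markdown", encoding="utf-8")
    matched = match_input_files(root)

print(matched)
```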

Processor Options

Each processor can have:

prompt - Inline prompt string
prompt_file - Path to single prompt file
prompt_files - Array of prompt files (concatenated)
model - AI model to use (default: gpt-4o)
output_folder - Where to save results (for final processor)
output_processor - Name of next processor in chain
output_file_extension - Extension for output files (default: .txt)
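As a rough illustration of the options above, the Python sketch below fills in the two documented defaults and concatenates prompt files into one prompt. It is not the CLI's implementation: the helper names, the join order (prompt, then prompt_file, then prompt_files), and the blank-line separator are all assumptions.

```python
import tempfile
from pathlib import Path

# Defaults taken from the option list above.
DEFAULTS = {"model": "gpt-4o", "output_file_extension": ".txt"}


def with_defaults(processor: dict) -> dict:
    """Return a copy of the processor with the documented defaults filled in."""
    return {**DEFAULTS, **processor}


def build_prompt(processor: dict, base_dir: Path) -> str:
    """Join the inline prompt and any prompt files into one string.

    Assumption: parts are joined in the order prompt, prompt_file,
    prompt_files, separated by a blank line.
    """
    parts = []
    if processor.get("prompt"):
        parts.append(processor["prompt"])
    names = []
    if processor.get("prompt_file"):
        names.append(processor["prompt_file"])
    names.extend(processor.get("prompt_files", []))
    for name in names:
        parts.append((base_dir / name).read_text(encoding="utf-8").strip())
    return "\n\n".join(parts)


# Create two small prompt files so the example runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "role.md").write_text("You are a careful editor.\n", encoding="utf-8")
    (base / "task.md").write_text("Summarize the document.\n", encoding="utf-8")
    processor = with_defaults({"prompt_files": ["role.md", "task.md"]})
    prompt = build_prompt(processor, base)

print(processor["model"])
print(prompt)
```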

Chaining Processors

Process files through multiple AI steps:

{
  "processors": {
    "extract": {
      "prompt_files": ["prompts/extract-themes.md"],
      "model": "gpt-4o",
      "output_processor": "analyze"
    },
    "analyze": {
      "prompt_files": ["prompts/deep-analysis.md"],
      "model": "claude-sonnet-4-20250514",
      "output_processor": "summarize"
    },
    "summarize": {
      "prompt_files": ["prompts/create-summary.md"],
      "model": "gpt-4o",
      "output_folder": "./output",
      "output_file_extension": ".md"
    }
  }
}
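One way to reason about a chain like the one above is to follow output_processor links until a processor has none, which is handy for catching typos or accidental loops before a run. The Python sketch below is illustrative only; `resolve_chain` is a hypothetical helper, and it assumes the caller names the starting processor, since this README does not say how the CLI picks one.

```python
def resolve_chain(processors: dict, start: str) -> list:
    """Follow output_processor links from start and return the visit order."""
    chain, seen = [], set()
    name = start
    while name is not None:
        if name not in processors:
            raise KeyError(f"unknown processor: {name}")
        if name in seen:
            raise ValueError(f"cycle detected at processor: {name}")
        seen.add(name)
        chain.append(name)
        # A processor without output_processor ends the chain.
        name = processors[name].get("output_processor")
    return chain


processors = {
    "extract": {"model": "gpt-4o", "output_processor": "analyze"},
    "analyze": {"model": "claude-sonnet-4-20250514", "output_processor": "summarize"},
    "summarize": {"model": "gpt-4o", "output_folder": "./output"},
}
chain = resolve_chain(processors, "extract")
print(" -> ".join(chain))
```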

Batch Processing

Group multiple files into single AI requests:

{
  "max_batch_size": 3
}

null or 1 - Process files individually
> 1 - Group N files per request (concatenated together)
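The grouping rule can be sketched in a few lines of Python. This is an illustration of the documented behavior rather than the CLI's code; `make_batches` is a hypothetical name, and the file names are made up.

```python
def make_batches(files, max_batch_size=None):
    """Split files into groups of max_batch_size; null or 1 means one file per group."""
    size = max_batch_size if max_batch_size and max_batch_size > 1 else 1
    return [files[i:i + size] for i in range(0, len(files), size)]


files = ["a.md", "b.md", "c.md", "d.md", "e.md"]
print(make_batches(files, 3))     # two groups: three files, then two
print(make_batches(files, None))  # five groups of one file each
```

With five files and max_batch_size set to 3, the last request carries the two remaining files.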

Supported Models

Via abso-ai, supports:

OpenAI:

gpt-4o, gpt-4o-mini
gpt-4-turbo, gpt-4
o1-preview, o1-mini
o3-mini

Anthropic:

claude-sonnet-4-20250514
claude-opus-4-20250514
claude-3-5-sonnet-20241022
claude-3-opus-20240229

Google:

gemini-2.0-flash-exp
gemini-1.5-pro
gemini-1.5-flash

Others:

Mistral AI
Groq
Cohere
Together AI
Perplexity
Fireworks
OpenRouter (access to 100+ models)

Commands

`phylo-cli --init`

Generate a config file with all available options.

Creates phylo.config.json in current directory.

`phylo-cli --config <path>`

Process files using the specified config file.

phylo-cli --config my-config.json

`phylo-cli --setup-keys`

Create global API key configuration file.

Creates ~/.phylo/config with secure permissions (0600).

Usage Examples

Journal Analysis

# Initialize config
phylo-cli --init

# Edit phylo.config.json
{
  "input_folder": "./journals/2024",
  "processors": {
    "analyze": {
      "prompt": "Analyze this journal entry for emotional themes and insights.",
      "model": "claude-sonnet-4-20250514",
      "output_folder": "./analysis"
    }
  }
}

# Run analysis
phylo-cli --config phylo.config.json

Document Translation

{
  "input_folder": "./docs/en",
  "processors": {
    "translate": {
      "prompt": "Translate this markdown document to Spanish. Preserve formatting.",
      "model": "gpt-4o",
      "output_folder": "./docs/es"
    }
  }
}

Multi-Stage Processing

{
  "input_folder": "./articles",
  "max_batch_size": 5,
  "processors": {
    "extract": {
      "prompt": "Extract key points from these articles.",
      "model": "gpt-4o-mini",
      "output_processor": "synthesize"
    },
    "synthesize": {
      "prompt": "Create a comprehensive synthesis of these key points.",
      "model": "claude-sonnet-4-20250514",
      "output_folder": "./summaries"
    }
  }
}

Security Best Practices

1. File Permissions

The CLI automatically checks permissions on ~/.phylo/config:

# Should be 0600 (rw-------)
ls -la ~/.phylo/config

# Fix if needed
chmod 600 ~/.phylo/config
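A permission check of this kind can be approximated in Python as shown below. This is a sketch for POSIX systems, not the CLI's own check: `is_owner_only` is a hypothetical helper, and it treats any group or other permission bits as too permissive.

```python
import os
import stat
import tempfile


def is_owner_only(path: str) -> bool:
    """Return True when neither group nor others have any access to path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Demonstrate on a temporary file instead of a real key file.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    os.chmod(path, 0o600)
    secure = is_owner_only(path)
    os.chmod(path, 0o644)
    insecure = not is_owner_only(path)
finally:
    os.remove(path)

print(secure, insecure)
```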

2. Git Safety

Always add to .gitignore:

.env
*.config.json
phylo.config.json

The --init command creates this automatically.

3. API Key Validation

The CLI validates API key formats:

Anthropic keys should start with sk-ant-
OpenAI keys should start with sk-

Warnings are displayed for invalid formats.
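The prefix rules above amount to a simple string check. The Python sketch below shows the idea; the helper name and warning text are hypothetical, not the CLI's actual output.

```python
# Prefixes taken from the rules listed above.
KEY_PREFIXES = {
    "ANTHROPIC_API_KEY": "sk-ant-",
    "OPENAI_API_KEY": "sk-",
}


def key_format_warnings(keys: dict) -> list:
    """Return one warning string per key whose value lacks the expected prefix."""
    warnings = []
    for name, value in keys.items():
        prefix = KEY_PREFIXES.get(name)
        if prefix and not value.startswith(prefix):
            warnings.append(f"{name} should start with {prefix}")
    return warnings


print(key_format_warnings({
    "ANTHROPIC_API_KEY": "sk-ant-example",
    "OPENAI_API_KEY": "not-a-key",
}))
```

A prefix check only catches obvious mistakes such as pasting the wrong provider's key; it cannot tell whether a key is actually valid.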

4. Comparison of Methods

| Method | Security | Convenience | Use Case |
|--------|----------|-------------|----------|
| Environment Variables | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | Personal laptop, most secure |
| Global Config | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | CLI tools, good balance |
| Project .env | ⭐⭐⭐ | ⭐⭐⭐⭐ | Per-project apps only |
| Config JSON | ⭐⭐ | ⭐⭐⭐⭐⭐ | CI/CD automation only |

5. What NOT to Do

❌ Don't commit API keys to git

Check with: git log --all -S "sk-ant-"

❌ Don't use config JSON for personal projects

Easy to accidentally commit

❌ Don't share config files

Even in private messages

❌ Don't use permissive file permissions

Never 0644 or 0666 on key files

Auto-Update Notifications

The CLI automatically checks for updates on every run:

Checks npm registry for latest version (3 second timeout)
Compares with your installed version
Displays update message if newer version available
Non-blocking - runs in background and silently fails if network unavailable

Example output when update is available:

Update available: 1.0.0 → 1.2.0
Run: npm install -g phylo-cli

To update:

npm install -g phylo-cli

To check current version:

phylo-cli --version
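The version comparison step of the update check can be sketched as follows. This Python example is illustrative only: the helper names are hypothetical, and it handles plain MAJOR.MINOR.PATCH versions without pre-release tags. It compares numeric parts rather than strings, so 1.10.0 correctly ranks above 1.9.0.

```python
def parse_version(version: str) -> tuple:
    """Turn '1.2.0' (or 'v1.2.0') into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def update_message(installed: str, latest: str):
    """Return an update notice when latest is newer than installed, else None."""
    if parse_version(latest) > parse_version(installed):
        return (
            f"Update available: {installed} → {latest}\n"
            "Run: npm install -g phylo-cli"
        )
    return None


print(update_message("1.0.0", "1.2.0"))
```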

Troubleshooting

"Config file already exists"

The --init command won't overwrite existing configs. Rename or delete the old one first.

"WARNING: Global config file has insecure permissions"

Fix with:

chmod 600 ~/.phylo/config

"No new items to process"

All files have been processed. To reprocess:

{
  "last_processed_file": null
}
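How last_processed_file interacts with the file list is not spelled out here. One plausible reading, sketched below in Python, is that files are handled in sorted order and anything sorting after the recorded file is still pending. Both the ordering rule and the `pending_files` helper are assumptions, not documented behavior.

```python
def pending_files(all_files, last_processed_file=None):
    """Return the files that still need processing, in sorted order.

    Assumption: files are processed in sorted path order, so everything
    sorting after last_processed_file is still pending.
    """
    ordered = sorted(all_files)
    if last_processed_file is None:
        return ordered  # nothing recorded yet, so process everything
    return [name for name in ordered if name > last_processed_file]


files = ["2024-01-03.md", "2024-01-01.md", "2024-01-02.md"]
print(pending_files(files, "2024-01-02.md"))
print(pending_files(files, None))
```

Under this reading, resetting last_processed_file to null makes every file pending again, which matches the reprocessing tip above.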

API Key Not Found

Check priority order:

Is it in environment? echo $ANTHROPIC_API_KEY
Is it in ~/.phylo/config? cat ~/.phylo/config
Is it in .env? cat .env
Is it in config JSON? Check env object

Update Check Not Working

The update check:

Times out after 3 seconds
Fails silently if network unavailable
Requires internet connection to npm registry
Only runs when you execute a command

This is normal and won't affect functionality.

Development

Building

npm run build

Testing Locally

npm link
phylo-cli --help

Publishing to npm

Update version in package.json
Build the package:

npm run build

Log in to npm:

npm login

Publish:

npm publish

Verify:

npm install -g phylo-cli

License

MIT

Related Projects

abso-ai - Universal AI API client
phylo-processor - Core processing library

Contributing

Contributions welcome! Please open an issue or PR.

Support

For issues and questions:

GitHub Issues: [repository issues URL]
Documentation: [repository URL]
