ai-data-analyst
v1.1.1
AI-powered data analysis CLI tool with visualization capabilities using Claude models - Built by BoVerse.io Team
AI Data Analyst
Built by BoVerse.io Team 🚀
A powerful CLI tool for AI-powered data analysis using Claude models. Analyze CSV, JSON, and Excel files with natural language queries and generate insightful visualizations directly in your terminal.
Features
- Natural Language Analysis: Ask questions about your data in plain English
- Multiple File Formats: Support for CSV, JSON, and Excel (.xlsx, .xls)
- Claude AI Models: Choose from multiple Claude models (Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Sonnet, Claude 3 Haiku)
- Interactive Mode: Conversational analysis with context retention
- Visualizations: ASCII charts (bar, line, histogram, tables) in the terminal
- Statistical Analysis: Automatic descriptive statistics and data profiling
- Streaming Output: Real-time analysis streaming
- Flexible Authentication: Support for OpenRouter and direct Anthropic API
- Secure Configuration: API credentials stored in a user-only config file, with optional OS keychain storage
Installation
Global Installation (Recommended)
npm install -g ai-data-analyst
Local Installation
npm install ai-data-analyst
Quick Start
1. Setup Configuration
Run the setup wizard to configure your API credentials:
ai-data-analyst setup
Or use the short alias:
ada setup
You'll be prompted to:
- Select your API provider (OpenRouter or Anthropic)
- Enter your API key
- Choose your default Claude model
2. Analyze Your First Dataset
ai-data-analyst analyze data.csv --query "What are the main trends in this data?"
3. Interactive Mode
Start an interactive session for exploratory analysis:
ai-data-analyst interactive data.csv
API Key Management
Secure Storage Options
Option 1: Use the Setup Command (Recommended)
The setup command stores your API key in a secure config file:
ai-data-analyst setup
Config is stored at: ~/.ai-data-analyst/config.json
Security Features:
- Stored outside your project directory
- File permissions set to user-only (chmod 600)
- Not committed to version control
- Can be encrypted using OS keychain (see below)
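The user-only permission listed above can be verified from Node.js. The following is a minimal sketch, not part of the tool: `isUserOnly` is a hypothetical helper, and it relies on POSIX permission bits, so the result is not meaningful on Windows.

```javascript
const fs = require("node:fs");

// Hypothetical helper (not part of ai-data-analyst): returns true when the
// file grants no permissions to group or others, as after chmod 600.
function isUserOnly(filePath) {
  const mode = fs.statSync(filePath).mode & 0o777; // keep permission bits only
  return (mode & 0o077) === 0; // group and other bits must all be clear
}
```

Called with the path to `~/.ai-data-analyst/config.json`, a `false` result would suggest running `chmod 600` on that file.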
Option 2: Environment Variables
Create a .env file in your project (never commit this!):
OPENROUTER_API_KEY=your_key_here
# OR
ANTHROPIC_API_KEY=your_key_here
Important: Always add .env to .gitignore:
# .gitignore
.env
.env.local
.env.*.local
Option 3: System Environment Variables
Add to your shell profile (~/.zshrc, ~/.bashrc, etc.):
export OPENROUTER_API_KEY="your_key_here"
Option 4: OS Keychain Integration (Most Secure)
For production use, integrate with your OS keychain:
macOS:
# Store API key in Keychain
security add-generic-password -a $USER -s ai-data-analyst -w "your_api_key"
# Retrieve API key
security find-generic-password -a $USER -s ai-data-analyst -w
Linux (using libsecret):
secret-tool store --label='AI Data Analyst' api-key openrouter
Windows (Credential Manager): Use the Windows Credential Manager or PowerShell:
cmdkey /generic:"ai-data-analyst" /user:"api-key" /pass:"your_api_key"
Getting API Keys
OpenRouter (Recommended)
- Visit OpenRouter
- Sign up and navigate to API Keys
- Create a new API key
- OpenRouter provides access to all Claude models with unified pricing
Anthropic Direct API
- Visit Anthropic Console
- Sign up and navigate to API Keys
- Create a new API key
- Requires separate Anthropic account
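If you script around the CLI, it can help to detect which provider's key is present in the environment. The sketch below is hypothetical and not taken from the tool: `resolveProvider` is an illustrative name, the `'anthropic'` provider value is assumed by analogy with `'openrouter'`, and preferring OpenRouter when both variables are set is an assumption rather than documented CLI behavior.

```javascript
// Hypothetical helper: picks a provider from the two environment variables
// named in this README. The precedence (OpenRouter first) is an assumption.
function resolveProvider(env = process.env) {
  if (env.OPENROUTER_API_KEY) {
    return { provider: "openrouter", apiKey: env.OPENROUTER_API_KEY };
  }
  if (env.ANTHROPIC_API_KEY) {
    // "anthropic" is an assumed provider value, not confirmed by the README.
    return { provider: "anthropic", apiKey: env.ANTHROPIC_API_KEY };
  }
  return null; // neither variable is set
}
```

A `null` result is a reasonable cue to run `ada setup` before continuing.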
Commands
setup
Configure API credentials and default settings
ai-data-analyst setup
analyze <file>
Analyze a data file with a query
ai-data-analyst analyze sales.csv --query "What are the top selling products?"
# Options:
# -q, --query <query> Analysis query
# -m, --model <model> Specify Claude model
# -v, --visualize Generate visualizations
# -s, --stats Show statistical summary
# --stream Stream analysis output
Examples:
# Basic analysis
ada analyze data.csv -q "Summarize this dataset"
# With visualizations
ada analyze sales.csv -q "Show sales trends" -v
# With stats and streaming
ada analyze metrics.json -s --stream -q "What patterns exist?"
# Using specific model
ada analyze data.xlsx -m anthropic/claude-3-opus -q "Deep analysis"
interactive <file> or i <file>
Start interactive analysis session
ai-data-analyst interactive data.csv
# Options:
# -m, --model <model> Specify Claude model
Interactive Commands:
- Type questions to analyze data
- /stats - Show statistical summary
- /viz [type] - Generate visualization (table, bar, line, histogram)
- /table - Show data table
- /columns - List all columns
- /history - Show conversation history
- /clear - Clear history
- /help - Show help
- /exit - Exit interactive mode
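To illustrate how a session might tell slash commands apart from free-form questions, here is a minimal sketch. It is not the tool's actual parser: `parseInput` and the returned object shape are hypothetical, and only the command names come from the list above.

```javascript
// Command names taken from the Interactive Commands list.
const KNOWN_COMMANDS = ["stats", "viz", "table", "columns", "history", "clear", "help", "exit"];

// Hypothetical parser for one line of interactive input. Lines starting
// with "/" are treated as commands; anything else is a natural-language query.
function parseInput(line) {
  const text = line.trim();
  if (!text.startsWith("/")) {
    return { kind: "query", text };
  }
  // Split "/viz bar" into the command name and its arguments.
  const [name, ...args] = text.slice(1).split(/\s+/);
  if (!KNOWN_COMMANDS.includes(name)) {
    return { kind: "unknown", name };
  }
  return { kind: "command", name, args };
}
```

Under these assumptions, `/viz bar` yields a `viz` command with the argument `bar`, while any other text is passed along as a query.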
visualize <file> or viz <file>
Generate visualizations
ai-data-analyst visualize data.csv --type bar --x-column date --y-column revenue
# Options:
# -t, --type <type> Visualization type (table, bar, line, histogram)
# -x, --x-column <column> X-axis column
# -y, --y-column <column> Y-axis column
# --limit <number> Limit rows (default: 20)
models
List available Claude models
ai-data-analyst models
config
Show or clear configuration
# Show current config
ai-data-analyst config
# Clear config
ai-data-analyst config --clear
Supported File Formats
CSV Files
ada analyze sales.csv
JSON Files
ada analyze data.json
Supports both:
- Array of objects: [{...}, {...}]
- Single object: {...} (will be wrapped in array)
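The wrapping rule for a single object can be sketched as follows. This is an illustration rather than the tool's loader: `toRows` is a hypothetical name, and rejecting other JSON values (numbers, strings, null) is an assumption.

```javascript
// Hypothetical helper: turns parsed JSON into an array of row objects.
function toRows(parsed) {
  if (Array.isArray(parsed)) return parsed; // already an array of objects
  if (parsed !== null && typeof parsed === "object") return [parsed]; // wrap a single object
  // Assumption: any other JSON value is not usable as tabular data.
  throw new TypeError("Expected a JSON array or object");
}
```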
Excel Files
ada analyze report.xlsx
Supports .xlsx and .xls formats. Reads the first sheet by default.
Available Claude Models
| Model | ID | Best For |
|-------|-----|----------|
| Claude 3.5 Sonnet | anthropic/claude-3.5-sonnet | Most intelligent, complex analysis |
| Claude 3 Opus | anthropic/claude-3-opus | Deep analytical tasks |
| Claude 3 Sonnet | anthropic/claude-3-sonnet | Balanced performance |
| Claude 3 Haiku | anthropic/claude-3-haiku | Fast, efficient analysis |
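When scripting repeated runs, the table can be turned into a small lookup that produces the `-m` argument. This is only a sketch: the task labels (`complex`, `deep`, `balanced`, `quick`) and the `modelArgs` helper are hypothetical, while the model IDs are copied from the table.

```javascript
// Model IDs from the table above; the task labels are hypothetical.
const MODEL_FOR_TASK = new Map([
  ["complex", "anthropic/claude-3.5-sonnet"],
  ["deep", "anthropic/claude-3-opus"],
  ["balanced", "anthropic/claude-3-sonnet"],
  ["quick", "anthropic/claude-3-haiku"],
]);

// Returns the CLI arguments that select the model for a task label.
function modelArgs(task) {
  const id = MODEL_FOR_TASK.get(task);
  if (!id) throw new Error(`Unknown task label: ${task}`);
  return ["-m", id];
}
```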
Usage Examples
Example 1: Sales Data Analysis
# Analyze sales trends
ada analyze sales.csv -q "What are the monthly sales trends?" -v --stream
# Interactive exploration
ada i sales.csv
> What products have the highest revenue?
> Show me a bar chart of sales by category
> /viz bar
Example 2: Customer Data Insights
# Get customer demographics
ada analyze customers.json -q "Analyze customer demographics and segment patterns" -s
# Generate visualizations
ada viz customers.json --type histogram --y-column age
Example 3: Financial Analysis
# Analyze financial metrics
ada analyze financials.xlsx \
-q "Identify cost optimization opportunities and revenue trends" \
-m anthropic/claude-3-opus \
-v
# Quick stats
ada viz financials.xlsx --type table --limit 50
Example 4: Programmatic Usage
You can also use the library programmatically in Node.js:
// Require the package by the name it is installed under (npm install ai-data-analyst)
const { DataLoader, AIClient, DataAnalyzer, ConfigManager } = require('ai-data-analyst');

// Read the API key saved by `ada setup`
const configManager = new ConfigManager();
const config = {
  apiKey: configManager.getApiKey(),
  provider: 'openrouter',
  model: 'anthropic/claude-3.5-sonnet',
  maxTokens: 4096,
  temperature: 0.7
};

const dataLoader = new DataLoader();
const aiClient = new AIClient(config);
const analyzer = new DataAnalyzer(aiClient);

(async () => {
  // Load the dataset, then ask a natural-language question about it
  const dataFile = await dataLoader.loadFile('data.csv');
  const result = await analyzer.analyzeData({
    data: dataFile,
    query: 'What insights can you find?'
  });
  console.log(result.analysis);
})().catch((error) => {
  // Surface load or API failures instead of leaving an unhandled rejection
  console.error(error);
});
Advanced Features
Streaming Analysis
For long analyses, use streaming mode to see results in real-time:
ada analyze large-dataset.csv --query "Comprehensive analysis" --stream
Context-Aware Interactive Mode
The interactive mode maintains conversation context:
ada i data.csv
> What's the average revenue?
AI: The average revenue is $45,230...
> How does this compare to last year?
AI: Compared to last year... (remembers previous context)
Custom Model Selection
Override the default model for specific analyses:
# Use the most powerful model for complex analysis
ada analyze data.csv -m anthropic/claude-3-opus -q "Complex statistical analysis"
# Use fastest model for quick checks
ada analyze data.csv -m anthropic/claude-3-haiku -q "Quick summary"
Security Best Practices
- Never commit API keys to version control
- Use environment variables or the config manager
- Rotate API keys regularly
- Use OS keychain for production deployments
- Set restrictive file permissions on config files: chmod 600 ~/.ai-data-analyst/config.json
- Use separate keys for development and production
- Monitor API usage through your provider dashboard
Troubleshooting
API Key Issues
# Check current configuration
ada config
# Reconfigure
ada setup
# Clear and start fresh
ada config --clear
ada setup
File Loading Errors
# Verify file exists and has correct format
ada viz data.csv --type table
Model Not Available
# List all available models
ada models
# Use a different model
ada analyze data.csv -m anthropic/claude-3-haiku -q "test"
OpenRouter vs Anthropic API
- OpenRouter: Provides unified access to all Claude models, simpler setup
- Anthropic Direct: Direct access, may have different rate limits
Performance Tips
- Use appropriate models: Haiku for simple tasks, Opus for complex analysis
- Limit data size: For very large files, consider sampling first
- Use streaming: For long analyses to see progress
- Interactive mode: More efficient for multiple related queries
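One way to sample a very large CSV before analysis is systematic sampling, which keeps the header and every n-th data row. The sketch below is a hypothetical pre-processing step, not a feature of the tool; `sampleCsv` is an illustrative name, and it assumes one record per line (no quoted fields that contain newlines).

```javascript
// Hypothetical pre-processing step: keeps the header line plus every
// `step`-th data row of a CSV string (systematic sampling).
// Assumes one record per line, so quoted multi-line fields are not handled.
function sampleCsv(csvText, step) {
  if (!Number.isInteger(step) || step < 1) {
    throw new RangeError("step must be a positive integer");
  }
  // Split on LF or CRLF and drop empty lines such as a trailing newline.
  const lines = csvText.split(/\r?\n/).filter((line) => line.length > 0);
  const [header, ...rows] = lines;
  const sampled = rows.filter((_, index) => index % step === 0);
  return [header, ...sampled].join("\n");
}
```

The result can be written to a smaller file and passed to `ada analyze` in place of the original.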
Development
Clone and Install
git clone https://github.com/yourusername/ai-data-analyst.git
cd ai-data-analyst
npm install
Build
npm run build
Development Mode
npm run dev
Testing Locally
# Link locally
npm link
# Test commands
ada setup
ada analyze test-data.csv
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
License
MIT
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: [email protected]
Roadmap
- [ ] Export analysis reports (PDF, HTML)
- [ ] Support for more data sources (databases, APIs)
- [ ] Advanced statistical tests
- [ ] Custom visualization templates
- [ ] Data transformation capabilities
- [x] Multi-file analysis (available since v1.1.0 via analyze-folder)
- [ ] Collaborative analysis sessions
- [ ] Web dashboard interface
Acknowledgments
- Powered by Anthropic's Claude
- Built with TypeScript and Node.js
- Uses OpenRouter for unified API access
Built with ❤️ by BoVerse.io Team
Empowering data analysis with AI
Multi-File Analysis (New in v1.1.0!)
Analyze multiple data files in a directory at once. The tool will automatically:
- Find all CSV, JSON, and Excel files in the specified directory
- Display a summary of all files
- Show common columns across files
- Ask you what you want to analyze
- Provide comprehensive cross-file insights
Usage
# Analyze all files in current directory
ada analyze-folder
# or use the short alias
ada mf
# Analyze files in a specific directory
ada analyze-folder ./data
# With streaming output
ada analyze-folder --stream
Example Workflow
$ ada mf
📂 Files Found:
1. sales.csv
CSV | 20 rows | 6 columns
2. customers.json
JSON | 10 rows | 8 columns
🔗 Common Columns:
customer_id, date
📊 Combined Summary:
Total files: 2
File types: csv, json
Total rows: 30
Unique columns across all files: 12
What would you like to know about these datasets?
> Find correlations between sales and customer demographics
🔍 Analyzing multiple datasets...
What It Can Do
- Cross-file patterns: Find relationships between different datasets
- Data integration: Suggest how files can be combined
- Comparative analysis: Compare metrics across files
- Common column detection: Identify potential join keys
- Multi-dataset insights: Generate insights that span multiple files
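Common-column detection and the combined summary can be understood as simple set operations over each file's column names. The following is a minimal sketch under stated assumptions, not the tool's implementation: `summarizeFiles` and its input shape (`{ name, rows, columns }`) are hypothetical, and a column counts as common only when every file contains it.

```javascript
// Hypothetical sketch of the combined summary; not the tool's actual code.
// Each file is described as { name, rows, columns }.
function summarizeFiles(files) {
  // Union of all column names across files (duplicates collapse in the Set).
  const allColumns = new Set(files.flatMap((file) => file.columns));
  // Assumption: a column is "common" only when every file contains it.
  const commonColumns = [...allColumns].filter((column) =>
    files.every((file) => file.columns.includes(column))
  );
  return {
    totalFiles: files.length,
    totalRows: files.reduce((sum, file) => sum + file.rows, 0),
    uniqueColumns: allColumns.size,
    commonColumns,
  };
}
```

For two files with 6 and 8 columns that share `customer_id` and `date`, this gives 12 unique columns and 30 total rows for 20 and 10 rows, matching the figures in the example workflow.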
