@lukeocodes/vectornator
v1.1.0
Maintain remote vector stores with your repository content - GitHub Action and CLI tool
Luke's Vectornator
Maintain remote vector stores with your repository content. Automatically sync documentation, markdown files, and other text content to vector databases for AI applications.
Features
- 🔄 Automatic Synchronization: Keep vector stores in sync with your repository
- 📝 Smart Change Detection: Only sync files that have changed using content hashing
- 🎯 Metadata Rich: Sends comprehensive metadata with each file
- 🔌 Extensible: Support for multiple vector store providers
- 📦 Dual Usage: Works as both npm package and GitHub Action
- 🔐 Git Branch Storage: Store sync metadata in a dedicated git branch (no file clutter!)
- 🎨 Beautiful CLI: Colored output with progress indicators
Quick Start
As a GitHub Action
name: Sync to Vector Store
on:
  push:
    branches: [main]
    paths:
      - "docs/**"
      - "*.md"
jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Required for git branch metadata
      - uses: lukeocodes/vectornator@v1
        with:
          api-key: ${{ secrets.OPENAI_API_KEY }}
          store-id: ${{ secrets.VECTOR_STORE_ID }}
          directory: docs
          patterns: "**/*.md,**/*.mdx"
As an npm Package
# Install globally
npm install -g @lukeocodes/vectornator
# Or use with npx
npx @lukeocodes/vectornator sync --directory ./docs
Installation
npm Package
npm install -g @lukeocodes/vectornator
GitHub Action
Add to your workflow:
- uses: lukeocodes/vectornator@v1
Configuration
Environment Variables
# OpenAI Provider
OPENAI_API_KEY=your-api-key
OPENAI_STORE_ID=your-store-id
# Metadata branch name (optional)
VECTORNATOR_METADATA_BRANCH=metadata/my-project
# Other providers (coming soon)
PINECONE_API_KEY=your-api-key
PINECONE_ENVIRONMENT=your-environment
See Configuration Guide for more options.
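As a rough illustration of how these variables could combine with the `--api-key` and `--store-id` flags, here is a minimal sketch. The `resolveConfig` function, its types, and the rule that flags override environment variables are all assumptions made for the example. They are not taken from Vectornator's source.

```typescript
// Hypothetical shapes: field names are illustrative, not Vectornator's own types.
interface CliFlags {
  apiKey?: string;
  storeId?: string;
}

interface ResolvedConfig {
  apiKey: string;
  storeId: string | undefined;
  metadataBranch: string;
}

type Env = Record<string, string | undefined>;

function resolveConfig(flags: CliFlags, env: Env): ResolvedConfig {
  // Assumption: an explicit flag wins over the matching environment variable.
  const apiKey = flags.apiKey ?? env["OPENAI_API_KEY"];
  if (!apiKey) {
    throw new Error("Missing API key: pass --api-key or set OPENAI_API_KEY");
  }
  return {
    apiKey,
    storeId: flags.storeId ?? env["OPENAI_STORE_ID"],
    // Falls back to the default branch name mentioned in the Metadata Storage section.
    metadataBranch: env["VECTORNATOR_METADATA_BRANCH"] ?? "metadata/vectornator",
  };
}
```

In a Node.js script this could be called as `resolveConfig({}, process.env)`.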
CLI Options
vectornator sync [options]
Options:
  -d, --directory <path>       Directory to sync (default: ".")
  -p, --provider <name>        Vector store provider (default: "openai")
  --patterns <patterns...>     File patterns to include
  --exclude <patterns...>      File patterns to exclude
  --dry-run                    Show what would be done without making changes
  --metadata-storage <type>    Metadata storage type: git-branch or file (default: git-branch)
  --store-id <id>              Vector store ID
  --api-key <key>              API key for the provider
  -v, --verbose                Verbose output
  -h, --help                   Display help
GitHub Action Inputs
| Input | Description | Required | Default |
| ----------- | ---------------------------------------------- | -------- | --------------------------------- |
| api-key | API key for the vector store provider | Yes | - |
| store-id | Vector store ID | No | - |
| directory | Directory to sync | No | . |
| provider | Vector store provider | No | openai |
| patterns | File patterns to include (comma-separated) | No | **/*.md,**/*.mdx,**/*.txt |
| exclude | File patterns to exclude (comma-separated) | No | node_modules/**,.git/**,dist/** |
| dry-run | Show what would be done without making changes | No | false |
| verbose | Enable verbose output | No | false |
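To make the comma-separated `patterns` and `exclude` inputs concrete, the sketch below splits them and filters a list of paths. It is only an illustration: `parsePatterns`, `globToRegExp`, and `selectFiles` are hypothetical names, and the hand-rolled matcher handles just `**/`, `**`, and `*`. A real tool would more likely rely on a glob library, and Vectornator's actual matching rules may differ.

```typescript
// Split a comma-separated input such as "**/*.md,**/*.mdx" into patterns.
function parsePatterns(input: string): string[] {
  return input
    .split(",")
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
}

// Minimal glob-to-RegExp conversion covering only "**/", "**", and "*".
function globToRegExp(glob: string): RegExp {
  let source = "";
  let i = 0;
  while (i < glob.length) {
    if (glob.startsWith("**/", i)) {
      source += "(?:.*/)?"; // zero or more leading directories
      i += 3;
    } else if (glob.startsWith("**", i)) {
      source += ".*"; // anything, including "/"
      i += 2;
    } else if (glob.charAt(i) === "*") {
      source += "[^/]*"; // anything within a single path segment
      i += 1;
    } else {
      // Escape regex metacharacters so they match literally.
      source += glob.charAt(i).replace(/[.+^${}()|[\]\\?]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp(`^${source}$`);
}

// Keep paths that match at least one include pattern and no exclude pattern.
function selectFiles(paths: string[], include: string, exclude: string): string[] {
  const inc = parsePatterns(include).map(globToRegExp);
  const exc = parsePatterns(exclude).map(globToRegExp);
  return paths.filter(
    (p) => inc.some((re) => re.test(p)) && !exc.some((re) => re.test(p))
  );
}
```

With the default inputs from the table, `README.md` and `docs/guide.mdx` would be selected, while anything under `node_modules/` or `dist/` would be dropped.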
Usage Examples
Basic Sync
# Sync current directory
vectornator sync
# Sync specific directory
vectornator sync --directory ./docs
# Dry run to see what would happen
vectornator sync --dry-run
Create a New Vector Store
vectornator create-store "my-documentation"
# Output: Store ID: vs_abc123...
List Files in Vector Store
vectornator list
Custom Patterns
# Only sync markdown files
vectornator sync --patterns "**/*.md"
# Exclude test files
vectornator sync --exclude "**/test/**" "**/*.test.md"
Metadata Storage
By default, Vectornator stores sync metadata in a dedicated git branch. This keeps your repository clean:
# View metadata
vectornator show-metadata
# Use file-based metadata instead
vectornator sync --metadata-storage file
Metadata Storage
Vectornator uses a dedicated git branch by default to store sync metadata. This means:
- ✅ No .vectornator directory in your repo
- ✅ Metadata is versioned and distributed with the repository
- ✅ Works seamlessly with GitHub Actions
- ✅ No timing issues between local and CI syncs
The metadata is stored in the metadata/vectornator branch and includes:
- File hashes for change detection
- Vector store file IDs
- Upload timestamps
- Version numbers
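One way to picture the stored metadata is as a per-file record keyed by path. The sketch below is an assumption-laden illustration: the `FileRecord` and `SyncMetadata` shapes, their field names, and the reading of "version numbers" as a per-file counter are all hypothetical and may not match the real format.

```typescript
// Hypothetical shapes: not taken from Vectornator's source.
interface FileRecord {
  hash: string; // content hash used for change detection
  storeFileId: string; // ID assigned by the vector store
  uploadedAt: string; // ISO 8601 upload timestamp
  version: number; // assumed here to count uploads of this file
}

interface SyncMetadata {
  files: Record<string, FileRecord>; // keyed by repository-relative path
}

// Return a new metadata object with the upload recorded; the input is not mutated.
function recordUpload(
  meta: SyncMetadata,
  path: string,
  hash: string,
  storeFileId: string,
  now: Date = new Date()
): SyncMetadata {
  const previous: FileRecord | undefined = meta.files[path];
  const record: FileRecord = {
    hash,
    storeFileId,
    uploadedAt: now.toISOString(),
    version: (previous?.version ?? 0) + 1,
  };
  return { files: { ...meta.files, [path]: record } };
}
```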
Supported Providers
OpenAI (Available Now)
const provider = new OpenAIProvider();
await provider.initialize({
  apiKey: process.env.OPENAI_API_KEY,
  storeId: process.env.OPENAI_STORE_ID,
});
Coming Soon
- Pinecone: High-performance vector database
- Weaviate: Open-source vector search engine
- Qdrant: Vector similarity search engine
- ChromaDB: Open-source embedding database
Creating Custom Providers
Implement the VectorStoreProvider interface by extending BaseVectorStoreProvider:
import { BaseVectorStoreProvider } from "@lukeocodes/vectornator";
export class MyCustomProvider extends BaseVectorStoreProvider {
  name = "custom";
  async validateConfig(): Promise<void> {
    // Validate your configuration
  }
  async connect(): Promise<void> {
    // Connect to your service
  }
  async uploadFile(
    filePath: string,
    content: Buffer,
    metadata: FileMetadata
  ): Promise<string> {
    // Upload file and return ID
    throw new Error("uploadFile is not implemented yet");
  }
  // ... implement other required methods
}
Development
# Clone the repository
git clone https://github.com/lukeocodes/vectornator.git
cd vectornator
# Install dependencies
npm install
# Build
npm run build
# Run tests
npm test
# Development mode
npm run dev
Testing with GitHub Actions
During development, you can test the sync functionality using the test workflow:
# Go to Actions tab in GitHub and run "Test Sync Workflow"
# Or trigger via GitHub CLI:
gh workflow run test-sync.yml -f dry-run=true -f provider=openai
The test workflow allows you to:
- Test different providers (openai, example)
- Toggle dry-run mode
- Test different metadata storage types
- Create test documents automatically
Architecture
vectornator/
├── src/
│ ├── types/ # TypeScript interfaces
│ ├── providers/ # Vector store providers
│ ├── core/ # Core sync engine
│ └── cli.ts # CLI interface
├── action.yml # GitHub Action definition
└── package.json # npm package definition
How It Works
- File Discovery: Scans your repository for files matching patterns
- Change Detection: Computes SHA-256 hashes to detect changes
- Metadata Enrichment: Adds file metadata (size, path, timestamps)
- Smart Sync: Only uploads changed files, removes deleted files
- State Tracking: Stores sync state in git branch or local file
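The change detection and smart sync steps above can be sketched as a comparison between freshly computed SHA-256 hashes and the hashes stored from the previous sync. This is a minimal sketch, not the tool's actual engine: `sha256`, `planSync`, and the `SyncPlan` shape are hypothetical names, and file contents are passed in as strings to keep the example self-contained.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of a file's content.
function sha256(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

interface SyncPlan {
  upload: string[]; // new or changed files
  remove: string[]; // files tracked previously but no longer present
  unchanged: string[]; // files whose hash matches the stored one
}

// currentFiles maps path -> content; storedHashes maps path -> hash from the last sync.
function planSync(
  currentFiles: Map<string, string>,
  storedHashes: Record<string, string>
): SyncPlan {
  const plan: SyncPlan = { upload: [], remove: [], unchanged: [] };
  currentFiles.forEach((content, path) => {
    if (storedHashes[path] === sha256(content)) {
      plan.unchanged.push(path);
    } else {
      plan.upload.push(path);
    }
  });
  for (const path of Object.keys(storedHashes)) {
    if (!currentFiles.has(path)) {
      plan.remove.push(path);
    }
  }
  return plan;
}
```

Only the `upload` and `remove` lists would need calls to the vector store, which is what keeps repeated syncs cheap when few files change.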
Metadata Storage Options
Vectornator supports two metadata storage strategies:
Git Branch (Default)
Uses a dedicated metadata/vectornator branch to store sync state:
- Metadata is independent of commits
- Works seamlessly with GitHub Actions
- No timing issues between local and CI syncs
- Automatically managed by the tool
# Default behavior
vectornator sync
# Explicitly specify git-branch storage
vectornator sync --metadata-storage git-branch
The GitHub Action automatically handles fetching and pushing the metadata branch.
File-based
Stores metadata in .vectornator/metadata.json:
- Simple and portable
- No git integration required
- Must be committed to share state between environments
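The file-based strategy amounts to reading and writing a JSON document. The sketch below shows one plausible shape for that, assuming Node.js; `loadMetadata` and `saveMetadata` are hypothetical helpers, and treating a missing file as empty metadata is an assumption rather than documented behavior.

```typescript
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { dirname } from "node:path";

type Metadata = Record<string, unknown>;

// Read metadata from disk; a missing file is treated as "nothing synced yet".
function loadMetadata(file: string): Metadata {
  if (!existsSync(file)) {
    return {};
  }
  return JSON.parse(readFileSync(file, "utf8")) as Metadata;
}

// Write metadata as pretty-printed JSON, creating the parent directory if needed.
function saveMetadata(file: string, data: Metadata): void {
  mkdirSync(dirname(file), { recursive: true });
  writeFileSync(file, JSON.stringify(data, null, 2) + "\n", "utf8");
}
```

Pretty-printing keeps diffs readable when the file is committed to share state between environments.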
# Use file storage
vectornator sync --metadata-storage file --metadata-file .vectornator/metadata.json
Best Practices
- Use Specific Patterns: Target only the files you need in the vector store
- Exclude Large Files: Vector stores work best with text content
- Regular Syncs: Set up CI/CD to sync on every push
- Monitor Usage: Track your API usage and costs
- Version Control: The metadata travels with your repository
Troubleshooting
"Vector store does not exist"
Create a new store:
vectornator create-store "my-docs"
"No metadata found"
For existing projects, run an initial sync:
vectornator sync --force
Metadata Branch Issues
If you need to reset the metadata branch:
# Delete local metadata branch
git branch -D metadata/vectornator
# Delete remote metadata branch
git push origin --delete metadata/vectornator
# Run sync again to recreate
vectornator sync
Contributing
Contributions are welcome! Please read our Contributing Guide for details.
License
MIT © Luke Oliff
Acknowledgments
- Inspired by the need to keep AI applications in sync with documentation
- Built with TypeScript and ❤️
