@llamafarm/llamafarm v0.1.0
# LLaMA Farm CLI
Deploy AI models, agents, and databases into single deployable binaries - no cloud required.
## Installation

```bash
npm install -g @llamafarm/llamafarm
```

## Quick Start
```bash
# Deploy a model
llamafarm plant llama3-8b

# Deploy with optimization
llamafarm plant llama3-8b --optimize

# Deploy to a specific target
llamafarm plant mistral-7b --target raspberry-pi

# Development/testing (no model download)
llamafarm plant llama3-8b --mock
```

## Complete Workflow Example
```bash
# 1. Plant - configure your AI deployment
llamafarm plant llama3-8b \
  --device mac-arm \
  --agent chat-assistant \
  --rag \
  --database vector

# 2. Bale - compile to a single binary
llamafarm bale ./.llamafarm/llama3-8b \
  --device mac-arm \
  --optimize

# 3. Harvest - deploy anywhere
llamafarm harvest llama3-8b-mac-arm-v1.0.0.bin --run

# Or just copy and run the binary directly (no dependencies needed!)
./llama3-8b-mac-arm-v1.0.0.bin
```

## Features
- One-Line Deployment - deploy complex AI models with a single command
- Zero Dependencies - compiled binaries run anywhere
- 100% Private - your data never leaves your device
- Lightning Fast - 10x faster than traditional deployments
- 90% Smaller - optimized models use a fraction of the original size
## Commands

### plant
Deploy a model to create a standalone binary.
```bash
llamafarm plant <model> [options]
```
Options:

```
--target <platform>   Target platform (mac, linux, windows, raspberry-pi)
--optimize            Enable size optimization
--agent <name>        Include an agent
--rag                 Enable RAG pipeline
--database <type>     Include database (vector, sqlite)
```

Examples:
```bash
# Basic deployment
llamafarm plant llama3-8b

# Deploy with RAG and a vector database
llamafarm plant mixtral-8x7b --rag --database vector

# Deploy optimized for Raspberry Pi
llamafarm plant llama3-8b --target raspberry-pi --optimize

# Deploy with a custom agent
llamafarm plant llama3-8b --agent customer-service
```

### bale
The Baler - compile your deployment into a single executable binary.
```bash
llamafarm bale <project-dir> [options]
```
Options:

```
--device <platform>   Target platform (mac, linux, windows, raspberry-pi)
--output <path>       Output binary path
--optimize <level>    Optimization level (none, standard, max)
--sign                Sign the binary for distribution
--compress            Extra compression (slower but smaller)
```

The Baler packages everything into a single binary:
- Quantized model (GGUF format)
- Agent configuration & code
- Embedded vector database
- Web UI
- Node.js runtime
- Platform-specific optimizations
Supported Platforms:
- mac / mac-arm / mac-intel - macOS with Metal acceleration
- linux / linux-arm - Linux with CUDA support
- windows - Windows with DirectML/CUDA
- raspberry-pi - Optimized for ARM devices
- jetson - NVIDIA Jetson edge devices
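When baling on the same machine you intend to run on, a `--device` value can be derived from the host with `uname`. This is a sketch: the mapping below is an assumption based on the platform list above (it is not part of the CLI), and cross-compiling for a different machine still requires an explicit flag.

```bash
# Map the host OS/architecture to a plausible --device value (assumed mapping).
case "$(uname -s)-$(uname -m)" in
  Darwin-arm64)  device=mac-arm ;;
  Darwin-x86_64) device=mac-intel ;;
  Linux-aarch64) device=linux-arm ;;
  Linux-x86_64)  device=linux ;;
  *)             device=linux ;;   # conservative fallback
esac
echo "$device"
```

The detected value can then be passed along, e.g. `llamafarm bale ./.llamafarm/llama3-8b --device "$device"`.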
Typical Binary Sizes:
- 7B models: 4-8GB (depending on quantization)
- 13B models: 8-13GB
- Mixtral: 25-45GB
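These sizes are dominated by the quantized weights, which you can sanity-check with back-of-envelope arithmetic: bytes ≈ parameter count × bits per weight ÷ 8. A small sketch - the bits-per-weight figures are typical GGUF quantization levels, not values documented by llamafarm, and the real binary adds runtime and Web UI overhead on top:

```bash
# Estimate quantized weight size in GB: params (in billions) * bits / 8.
# ~4.5 bits/weight is a typical 4-bit GGUF level; 8 bits is Q8 (assumptions).
estimate_gb() {
  awk -v p="$1" -v bits="$2" 'BEGIN { printf "%.1f\n", p * bits / 8 }'
}

estimate_gb 8 4.5    # prints 4.5 - an 8B model at ~4-bit lands in the 4-8GB range
estimate_gb 13 8     # prints 13.0 - a 13B model at 8-bit is near the top of 8-13GB
```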
Bale Examples
```bash
# Standard compilation
llamafarm bale ./.llamafarm/llama3-8b --device mac-arm

# Optimized for size
llamafarm bale ./.llamafarm/llama3-8b --device raspberry-pi --optimize max --compress

# Enterprise deployment with signing
llamafarm bale ./.llamafarm/mixtral --device linux --sign --output production.bin
```

### harvest
Deploy and run a compiled binary.
```bash
llamafarm harvest <binary-or-url> [options]
```
Options:

```
--run             Run immediately after deployment
--daemon          Run as background service
--port <number>   Override default port
--verify          Verify binary integrity
```

## Configuration
Create a `llamafarm.yaml` file for advanced configurations:
```yaml
name: my-assistant
base_model: llama3-8b
plugins:
  - vector_search
  - voice_recognition
data:
  - path: ./company-docs
    type: knowledge
optimization:
  quantization: int8
  target_size: 2GB
```

Then build:

```bash
llamafarm build
```

## Requirements
- Node.js 18+
- 8GB RAM (minimum)
- 10GB free disk space
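The Node.js requirement can be verified with a quick preflight. This is only a sketch - it checks the version alone, since portable RAM and disk checks differ per OS:

```bash
# Check that the installed Node.js is v18 or newer.
major_of() { printf '%s\n' "${1#v}" | cut -d. -f1; }   # "v18.19.0" -> "18"

node_version=$(node --version 2>/dev/null || echo v0.0.0)
if [ "$(major_of "$node_version")" -ge 18 ]; then
  echo "Node.js $node_version - OK"
else
  echo "Node.js 18+ required (found $node_version)" >&2
fi
```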
## Documentation
For full documentation, visit https://docs.llamafarm.ai
## Support

- Documentation
- Discord Community
- Issue Tracker
## Baler FAQ
Q: Can I run the binary on a different OS than the one where I compiled it?
A: No - you need to compile for each target platform. Use --device to specify the target.

Q: How much disk space do I need?
A: During compilation you need roughly 3x the final binary size. The final binary is typically 4-8GB for 7B models.

Q: Can I update the model without recompiling?
A: No - the model is embedded in the binary. This guarantees zero dependencies, but updates require recompilation.

Q: Does the binary need internet access?
A: No - everything runs completely offline once deployed.
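The ~3x disk-space rule above can be checked before starting a long bale run. A sketch for POSIX systems - `BINARY_GB` is an assumption you should set from the typical binary sizes listed earlier:

```bash
# Verify ~3x the expected binary size is free on the current volume.
BINARY_GB=8                                  # assumed final binary size
need_kb=$(( BINARY_GB * 3 * 1024 * 1024 ))   # 3x rule, in kilobytes
free_kb=$(df -Pk . | awk 'NR==2 { print $4 }')
if [ "$free_kb" -ge "$need_kb" ]; then
  echo "OK: enough free space to bale a ${BINARY_GB}GB binary"
else
  echo "Warning: need ~$(( need_kb / 1048576 ))GB free on this volume" >&2
fi
```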
## License

MIT © LLaMA Farm Team
