@emilshirokikh/slyos-sdk
v1.4.1
SlyOS - On-Device AI SDK for Web and Node.js
🔥 @emilshirokikh/slyos-sdk
Official SDK for the SlyOS on-device AI platform. Run AI models locally in browsers and Node.js.
📦 Installation
npm install @emilshirokikh/slyos-sdk
npm: https://www.npmjs.com/package/@emilshirokikh/slyos-sdk
🚀 Quick Start
import SlyOS from '@emilshirokikh/slyos-sdk';
// 1. Initialize
const sdk = new SlyOS({
  apiKey: 'sk_live_your_api_key'
});
await sdk.initialize();
// 2. Load model (downloaded once, then cached; see loadModel for sizes)
await sdk.loadModel('quantum-1.7b');
// 3. Generate responses
const response = await sdk.generate('quantum-1.7b',
  'What is artificial intelligence?',
  {
    temperature: 0.7,
    maxTokens: 100,
    topP: 0.9
  }
);
console.log(response);
// AI runs locally - no per-inference charges!
📚 API Reference
Constructor
new SlyOS(config: SlyOSConfig)
Config:
{
  apiKey: string;   // Get from dashboard
  apiUrl?: string;  // Optional, defaults to production
}
Methods
initialize()
Authenticates with the SlyOS backend and registers the device.
await sdk.initialize();
Returns: Promise<DeviceProfile>
loadModel(modelId)
Downloads and caches AI model locally.
await sdk.loadModel('quantum-1.7b');
Parameters:
- modelId (string): Model identifier
  - quantum-1.7b - 900MB, recommended
  - quantum-3b - 1.6GB, high quality
  - quantum-code-3b - 1.6GB, code-optimized
  - quantum-8b - 4.2GB, best quality
Returns: Promise<void>
First call: Downloads model (~1-2 min)
Subsequent calls: Uses cached model (<1 sec)
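As a rough sketch of how the size list above might drive model choice, the helper below picks the largest listed model that fits a download budget. `MODEL_SIZES_MB` and `pickModelForBudget` are hypothetical names, not part of the SDK; the sizes are copied from the list above, treating 1 GB as 1000 MB.

```javascript
// Hypothetical helper, not part of the SDK. Sizes come from the model list
// above (1 GB is treated as 1000 MB for simplicity).
const MODEL_SIZES_MB = [
  { id: 'quantum-1.7b', sizeMB: 900 },
  { id: 'quantum-3b', sizeMB: 1600 },
  { id: 'quantum-code-3b', sizeMB: 1600 },
  { id: 'quantum-8b', sizeMB: 4200 },
];

// Returns the id of the largest model whose download fits the budget,
// or null when even the smallest model is too big. On a size tie the
// model listed first wins.
function pickModelForBudget(budgetMB, models = MODEL_SIZES_MB) {
  let best = null;
  for (const model of models) {
    if (model.sizeMB <= budgetMB && (best === null || model.sizeMB > best.sizeMB)) {
      best = model;
    }
  }
  return best === null ? null : best.id;
}
```

The returned id can be passed straight to `sdk.loadModel(...)`. For a choice based on hardware rather than download size, `recommendModel()` and `canRunModel()` are the SDK's own entry points.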
generate(modelId, prompt, options?)
Generates AI response locally.
const response = await sdk.generate('quantum-1.7b',
  'Tell me about your menu',
  {
    temperature: 0.7,
    maxTokens: 150,
    topP: 0.9
  }
);
Parameters:
- modelId (string): Model to use
- prompt (string): Input text
- options (object, optional):
  - temperature (0-2): Creativity (default: 0.7)
  - maxTokens (10-2000): Max response length (default: 100)
  - topP (0-1): Nucleus sampling (default: 0.9)
Returns: Promise<string> - Generated text
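This README does not say whether `generate()` rejects or silently adjusts out-of-range options. One defensive approach is to normalize them yourself before the call. The sketch below applies the documented defaults and ranges; `GENERATE_LIMITS` and `normalizeGenerateOptions` are hypothetical names, not SDK exports.

```javascript
// Hypothetical helper, not part of the SDK: fills in the documented defaults
// and clamps each value to the documented range before calling generate().
const GENERATE_LIMITS = {
  temperature: { min: 0, max: 2, fallback: 0.7 },
  maxTokens: { min: 10, max: 2000, fallback: 100 },
  topP: { min: 0, max: 1, fallback: 0.9 },
};

function normalizeGenerateOptions(options = {}) {
  const result = {};
  for (const [name, { min, max, fallback }] of Object.entries(GENERATE_LIMITS)) {
    // Missing or non-numeric values fall back to the documented default.
    const value = Number.isFinite(options[name]) ? options[name] : fallback;
    result[name] = Math.min(max, Math.max(min, value));
  }
  // maxTokens is a token count, so keep it an integer.
  result.maxTokens = Math.round(result.maxTokens);
  return result;
}
```

A call would then look like `sdk.generate('quantum-1.7b', prompt, normalizeGenerateOptions(userOptions))`.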
chatCompletion(modelId, request)
OpenAI-compatible chat completions.
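The request shape is not documented here. Assuming "OpenAI-compatible" means the usual `messages` array of `{ role, content }` objects, a request could be assembled as sketched below. `buildChatRequest` is a hypothetical helper, and the field names are an assumption to verify against the SDK's types.

```javascript
// Builds an OpenAI-style chat request body. The exact fields accepted by
// chatCompletion() are an assumption based on the "OpenAI-compatible" label.
function buildChatRequest(systemPrompt, history, userMessage) {
  const messages = [];
  if (systemPrompt) {
    messages.push({ role: 'system', content: systemPrompt });
  }
  // Earlier turns, each expected as { role: 'user' | 'assistant', content }.
  for (const turn of history) {
    messages.push({ role: turn.role, content: turn.content });
  }
  messages.push({ role: 'user', content: userMessage });
  return { messages };
}
```

The result would be passed as `await sdk.chatCompletion('quantum-1.7b', request)`. In the OpenAI format the reply text sits at `choices[0].message.content`, but confirm what this SDK actually returns before relying on that.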
transcribe(modelId, audio, options?)
Speech-to-text using voicecore models.
recommendModel(category?)
Returns best model for the current device's hardware.
searchModels(query, options?)
Search HuggingFace Hub for ONNX-compatible models.
getDeviceProfile()
Returns the device's hardware profile (CPU, RAM, GPU, screen, network).
getModelContextWindow()
Returns current model's context window size in tokens.
getDeviceId()
Returns the persistent device identifier.
destroy()
Flushes pending telemetry and cleans up timers. Call before shutting down.
await sdk.destroy(); // Ensures telemetry is sent
getSdkVersion()
Returns the current SDK version string (e.g. '1.4.0').
getAvailableModels()
Returns available models grouped by category (llm, stt).
canRunModel(modelId, quant?)
Checks if the current device can run a specific model based on hardware profile.
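One way to use this check is to walk a preference list from most to least capable and load the first model the device can run. The sketch below takes the SDK instance as a parameter and assumes `canRunModel()` returns, or resolves to, a boolean; the return type is not documented here, so adjust if it is a richer object. `firstRunnableModel` is a hypothetical helper.

```javascript
// Hypothetical helper: tries candidates in order and returns the first one
// the device can run, or null if none qualify. Assumes canRunModel() returns
// (or resolves to) a boolean; `await` handles both cases.
async function firstRunnableModel(sdk, candidates) {
  for (const modelId of candidates) {
    if ((await sdk.canRunModel(modelId)) === true) {
      return modelId;
    }
  }
  return null;
}
```

For example, `await firstRunnableModel(sdk, ['quantum-8b', 'quantum-3b', 'quantum-1.7b'])` prefers the largest model and falls back to smaller ones.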
ragQuery(modelId, knowledgeBaseId, query, options?)
Performs a RAG query against a cloud-indexed knowledge base. Requires Hybrid RAG plan.
ragQueryLocal(modelId, knowledgeBaseId, query, options?)
Performs a RAG query using locally-cached embeddings for offline-capable retrieval.
ragQueryOffline(modelId, knowledgeBaseId, query, options?)
Fully offline RAG query using pre-synced knowledge base data.
syncKnowledgeBase(knowledgeBaseId)
Downloads and caches a knowledge base locally for offline RAG queries.
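The three RAG methods share a signature, so an application could chain them as tiers: cloud first, then cached embeddings, then fully offline. The sketch below is one possible pattern, not SDK behavior. It assumes each method rejects when it cannot answer, and `ragWithFallback` is a hypothetical helper that takes the SDK instance as a parameter.

```javascript
// Hypothetical fallback chain, not SDK behavior: try the cloud-indexed query
// first, then locally cached embeddings, then the fully offline path.
// Assumes each method rejects (throws) when it cannot answer.
async function ragWithFallback(sdk, modelId, knowledgeBaseId, query, options) {
  const attempts = ['ragQuery', 'ragQueryLocal', 'ragQueryOffline'];
  let lastError = null;
  for (const method of attempts) {
    try {
      const result = await sdk[method](modelId, knowledgeBaseId, query, options);
      return { source: method, result };
    } catch (error) {
      lastError = error; // remember why this tier failed, then try the next
    }
  }
  throw lastError;
}
```

For the offline tier to have data, `syncKnowledgeBase(knowledgeBaseId)` needs to have run while the device was online.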
🌐 Platform Support
| Platform | Status | Notes |
|----------|--------|-------|
| Chrome | ✅ Supported | Recommended |
| Safari | ✅ Supported | iOS 16+ |
| Edge | ✅ Supported | Chromium-based |
| Firefox | ⚠️ Limited | Some models work |
| Node.js | ✅ Supported | v18+ |
| React Native | 🚧 Coming Soon | Q3 2026 |
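In Node.js, the v18+ requirement from the table can be checked before initializing. The sketch below is a hypothetical pre-flight helper, not part of the SDK. It accepts version strings with or without a leading `v`.

```javascript
// Hypothetical pre-flight check: the table above lists Node.js v18+ as
// supported, so compare the major version before initializing the SDK.
function isSupportedNodeVersion(version, minMajor = 18) {
  const major = Number.parseInt(String(version).replace(/^v/, '').split('.')[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}
```

Typical use is `isSupportedNodeVersion(process.versions.node)`.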
💡 Usage Examples
Basic Chatbot
import SlyOS from '@emilshirokikh/slyos-sdk';
const sdk = new SlyOS({ apiKey: 'sk_live_...' });
await sdk.initialize();
await sdk.loadModel('quantum-1.7b');
async function chat(userMessage) {
  return await sdk.generate('quantum-1.7b', userMessage);
}
const response = await chat('What are your hours?');
console.log(response);
With System Prompt
const systemPrompt = `You are a helpful assistant for McDonald's.
Help with menu, hours, and nutrition. Be friendly and concise.`;
const userMessage = 'What breakfast items do you have?';
const fullPrompt = `${systemPrompt}\n\nCustomer: ${userMessage}\nAssistant:`;
const response = await sdk.generate('quantum-1.7b', fullPrompt, {
  temperature: 0.7,
  maxTokens: 150
});
React Integration
import { useState, useEffect } from 'react';
import SlyOS from '@emilshirokikh/slyos-sdk';
function Chatbot() {
  const [sdk, setSdk] = useState(null);
  const [loading, setLoading] = useState(true);
  const [response, setResponse] = useState('');
  useEffect(() => {
    async function init() {
      const client = new SlyOS({ apiKey: 'sk_live_...' });
      await client.initialize();
      await client.loadModel('quantum-1.7b');
      setSdk(client);
      setLoading(false);
    }
    init();
  }, []);
  async function handleChat(message) {
    const reply = await sdk.generate('quantum-1.7b', message);
    setResponse(reply);
  }
  if (loading) return <div>Loading AI...</div>;
  return (
    <div>
      <button onClick={() => handleChat('Hello!')}>
        Chat
      </button>
      <p>{response}</p>
    </div>
  );
}
🔧 Advanced Configuration
Custom Backend URL
const sdk = new SlyOS({
  apiKey: 'sk_live_...',
  apiUrl: 'https://api.slyos.world'
});
Multiple Models
await sdk.loadModel('quantum-1.7b');
await sdk.loadModel('quantum-3b');
// Use different models
const fast = await sdk.generate('quantum-1.7b', 'Quick question?');
const detailed = await sdk.generate('quantum-3b', 'Complex question?');
📊 Performance
Benchmarks (Quantum 1.7B)
| Metric | Browser | Node.js |
|--------|---------|---------|
| First load | 60-120s | 30-60s |
| Cached load | <1s | <0.5s |
| Inference | 10-15 tok/s | 15-25 tok/s |
| Memory | 1.2GB | 900MB |
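The inference row gives a quick way to estimate response latency: generation time is roughly the token count divided by throughput. The sketch below does that arithmetic with the Quantum 1.7B figures from the table. `estimateGenerationSeconds` is a hypothetical helper, and the estimate ignores prompt processing and model load time.

```javascript
// Rough estimate only: generation time is about tokens / (tokens per second).
// Throughput ranges are the Quantum 1.7B figures from the table above.
const THROUGHPUT_TOK_PER_SEC = {
  browser: { min: 10, max: 15 },
  node: { min: 15, max: 25 },
};

// Returns best- and worst-case seconds to generate `maxTokens` tokens.
function estimateGenerationSeconds(maxTokens, platform) {
  const range = THROUGHPUT_TOK_PER_SEC[platform];
  if (!range) {
    throw new Error(`Unknown platform: ${platform}`);
  }
  return {
    bestCase: maxTokens / range.max,
    worstCase: maxTokens / range.min,
  };
}
```

For example, 150 tokens in a browser works out to between 150 / 15 = 10 s and 150 / 10 = 15 s, which is why reducing `maxTokens` is the most direct lever on latency.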
🐛 Troubleshooting
Model won't load
// Check browser console for errors
// Ensure 2GB+ RAM available
// Try smaller model (quantum-1.7b)
CORS errors
// Backend must allow your domain
// Check CORS_ORIGIN environment variable
Slow inference
// Use smaller model
// Reduce maxTokens
// Check CPU/RAM availability
🔒 Security
- API keys stored client-side (localStorage)
- All inference happens locally (private)
- Inference telemetry batched locally (flushed every 10 inferences or 60s)
- No user data sent to cloud
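To make the batching rule concrete, here is an illustrative sketch of a "flush every 10 events or 60 seconds" policy. It is not the SDK's implementation: `TelemetryBatcher`, its options, and the `send` callback are hypothetical. For simplicity it checks age only when a new event is recorded, where a real implementation would likely also use a timer.

```javascript
// Illustrative sketch of a "flush every N events or T milliseconds" policy.
// This is NOT the SDK's implementation; names and structure are hypothetical.
class TelemetryBatcher {
  constructor(send, { maxBatch = 10, maxAgeMs = 60000, now = Date.now } = {}) {
    this.send = send;         // callback that receives an array of events
    this.maxBatch = maxBatch; // flush once this many events are queued
    this.maxAgeMs = maxAgeMs; // ...or once the oldest queued event is this old
    this.now = now;           // injectable clock, handy for testing
    this.queue = [];
    this.firstQueuedAt = null;
  }

  // Queues one event and flushes if either threshold has been reached.
  // Age is only checked here; there is no background timer in this sketch.
  record(event) {
    if (this.queue.length === 0) {
      this.firstQueuedAt = this.now();
    }
    this.queue.push(event);
    const tooMany = this.queue.length >= this.maxBatch;
    const tooOld = this.now() - this.firstQueuedAt >= this.maxAgeMs;
    if (tooMany || tooOld) {
      this.flush();
    }
  }

  // Sends whatever is queued; a shutdown hook would call this directly.
  flush() {
    if (this.queue.length === 0) return;
    const batch = this.queue;
    this.queue = [];
    this.firstQueuedAt = null;
    this.send(batch);
  }
}
```

Under a policy like this, events still in the queue at exit are lost unless something flushes them, which is the role `destroy()` plays in the SDK.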
📦 Package Info
- Package: @emilshirokikh/slyos-sdk
- Version: 1.4.1
- License: MIT
- Size: 168 KB (unpacked)
- Dependencies: axios, @huggingface/transformers
🤝 Contributing
# Clone repo
git clone https://github.com/BeltoAI/sly.os.git
cd sly.os/sdk
# Install dependencies
npm install
# Make changes to src/index.ts
# Build
npm run build
# Test locally
npm link
📄 License
MIT - See LICENSE file
🙏 Credits
Built with Hugging Face Transformers.js
📞 Support
- npm: https://www.npmjs.com/package/@emilshirokikh/slyos-sdk
- GitHub: https://github.com/BeltoAI/sly.os
- Docs: See main README.md
- Email: [email protected]
