midnight-ai v0.1.1
A powerful local and API-driven AI system assistant
🌙 Midnight
Local AI-powered voice assistant for OS automation through natural language.
Midnight combines speech recognition (STT), a small language model (1.5B parameters), and a secure terminal execution environment, all running locally on your machine.
Recommended Model: We recommend Qwen2.5-Coder-1.5B-Instruct (GGUF), a "Small Language Model" (SLM) optimized for code and terminal logic. At 1.5 billion parameters it strikes a practical balance between intelligence (understanding complex commands) and efficiency (runs on standard hardware with ~1.2 GB RAM usage).
Quick Start • Architecture • Safety • Contributing
Why Midnight?
- Privacy First: No cloud APIs required. Your voice and data stay on your machine.
- Secure by Design: Commands are validated against safety rules before execution.
- Developer Friendly: Easily extensible with custom tools and logic.
- Cross-Platform: Built to work on Windows, Linux, and macOS.
Architecture
Voice/Text Input
│
▼
┌─────────────┐
│ STT Engine │ faster-whisper (int8)
│ (optional) │
└──────┬──────┘
│ text
▼
┌─────────────┐
│ Router │ (classify: terminal, chat, search)
└──────┬──────┘
│
▼
┌─────────────┐
│ LLM Core │ Qwen 1.5B / Cloud API
└──────┬──────┘
│
├─────────────────────┐
▼ ▼
┌─────────────┐ ┌──────────────────┐
│ Chat / TTS │ │ Safety Layer │
│ Response │ │ (Guard + Rules) │
└──────────────┘ └────────┬─────────┘
│
▼
┌──────────────────┐
│ Executor │
│ subprocess │
└──────────────────┘
Project Structure
midnight/
├── core/ # Config, engine orchestrator, session management
├── safety/ # Command validation, execution, rule engine
├── stt/ # Speech-to-text (faster-whisper + audio capture + VAD)
├── tts/ # Text-to-speech (Piper ONNX + playback)
├── llm/ # Model loading, LoRA adapter manager, router, prompts
└── ui/ # Rich terminal interface
training/ # QLoRA fine-tuning scripts for Colab/Kaggle
tests/          # Unit tests
Quick Start
# Install core dependencies
pip install -r requirements.txt
# Or install with specific components
pip install -e ".[stt]" # + speech recognition
pip install -e ".[tts]" # + voice synthesis
pip install -e ".[llm]" # + language model
pip install -e ".[all]" # everything
# Run
python main.py
Safety
Commands are parsed with shlex and validated before execution:
| Level | Commands | Behavior |
|-------|----------|----------|
| 0 (Safe) | ls, pwd, whoami, echo | Instant execution |
| 1 (Normal) | mkdir, git, pip, cp | Execute with notice |
| 2 (Dangerous) | rm, sudo, mv, chmod | Requires confirmation |
| Blacklist | mkfs, dd, rm -rf / | Blocked |
Additional protections: pipe chain detection, redirect blocking, command substitution prevention.
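The tiers above can be sketched with nothing but the standard library. This is a minimal illustration of the idea, not Midnight's actual rule engine (the command sets and return values here are illustrative; the real logic lives in `midnight/safety/`):

```python
import shlex

# Illustrative tiers mirroring the table above (hypothetical rule sets)
SAFE = {"ls", "pwd", "whoami", "echo"}       # level 0: run immediately
NORMAL = {"mkdir", "git", "pip", "cp"}       # level 1: run, but notify the user
DANGEROUS = {"rm", "sudo", "mv", "chmod"}    # level 2: require confirmation
BLACKLIST = {"mkfs", "dd"}                   # never run

def classify(command: str) -> str:
    """Return a safety verdict for a single shell command string."""
    # Reject shell metacharacters up front: pipes, redirects, command
    # substitution and chaining would bypass per-command validation.
    if any(tok in command for tok in ("|", ">", "<", "$(", "`", ";", "&&")):
        return "blocked"
    try:
        tokens = shlex.split(command)
    except ValueError:           # unbalanced quotes, etc.
        return "blocked"
    if not tokens:
        return "blocked"
    prog = tokens[0]
    if prog in BLACKLIST:
        return "blocked"
    if prog == "rm" and "-rf" in tokens and "/" in tokens:
        return "blocked"         # the classic footgun
    if prog in DANGEROUS:
        return "confirm"
    if prog in NORMAL:
        return "notify"
    if prog in SAFE:
        return "safe"
    return "confirm"             # unknown commands default to confirmation
```

Defaulting unknown commands to "confirm" rather than "safe" keeps the system fail-closed.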
Models
- Model: Qwen2.5-Coder-1.5B-Instruct (GGUF, ~1.2 GB in 4-bit) or Cloud API (Gemini/OpenAI)
- Inference Engine: llama.cpp (CPU/GPU) or API Client
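Because both backends answer the same kind of request, the engine can treat them interchangeably. A stdlib-only sketch of that dual-backend dispatch, with hypothetical class names and a placeholder model path (the real interfaces in `midnight/llm/` may differ, and actual GGUF loading would go through llama.cpp bindings):

```python
from abc import ABC, abstractmethod

class Backend(ABC):
    """Common interface for local and cloud inference (hypothetical)."""
    @abstractmethod
    def generate(self, prompt: str) -> str: ...

class LocalLlama(Backend):
    def __init__(self, model_path: str):
        # A real implementation would load the GGUF file via llama.cpp
        # bindings here; we only record the path to keep the sketch runnable.
        self.model_path = model_path

    def generate(self, prompt: str) -> str:
        return f"[local:{self.model_path}] {prompt}"

class CloudAPI(Backend):
    def __init__(self, provider: str):
        self.provider = provider  # e.g. "gemini" or "openai"

    def generate(self, prompt: str) -> str:
        return f"[{self.provider}] {prompt}"

def make_backend(use_cloud: bool) -> Backend:
    """Pick the inference backend based on configuration."""
    if use_cloud:
        return CloudAPI("gemini")
    return LocalLlama("models/qwen2.5-coder-1.5b-instruct.gguf")
```

Keeping the interface this small means the router and safety layer never need to know which backend produced a response.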
Training
See training/README.md for fine-tuning instructions using QLoRA on Google Colab.
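QLoRA freezes the base model in 4-bit quantization and trains small low-rank adapters on top. A typical peft/bitsandbytes configuration might look like the following; the hyperparameter values are illustrative, not the project's actual training recipe (for that, see training/README.md):

```python
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype="bfloat16",
)

# Low-rank adapters trained on top of the quantized weights
lora_config = LoraConfig(
    r=16,                      # adapter rank (illustrative)
    lora_alpha=32,             # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```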
Requirements
- Python 3.10+
- RAM: 8 GB minimum
- GPU: Optional (2+ GB VRAM recommended for STT)
- OS: Windows, Linux, macOS
License
MIT
