node-red-contrib-linux-agent-devops

v1.1.10

Published

3 days ago

Multilingual DevOps agent with failover and cooperation

Linux Agent DevOps – Node-RED Module

A Telegram-powered SRE/DevOps copilot that diagnoses, executes, and corrects Linux commands autonomously using free AI models (Gemini, OpenRouter, DeepSeek).

Author: surprise_dev (Charles Poittevin)
Contact: [email protected]

Overview

Linux Agent DevOps is a custom Node-RED node that turns your Telegram bot into an intelligent SRE assistant.
It bridges natural language requests with autonomous Linux command execution, using a feedback loop where the AI reads terminal output and self-corrects.

Key Features

🤖 AI-powered Linux command generation (Gemini 2.0 Flash, OpenRouter free models, DeepSeek).
🔄 Autonomous error detection and self-correction via a terminal feedback loop.
📟 Real-time terminal output streamed to Telegram.
🌍 Language-agnostic with automatic user language detection.
⚙️ Production-focused: execution timeout, loop safety limits, optional sudo use.

Core Architecture

Recommended flow:

[Telegram Receiver] → [Linux Agent DevOps] → [Telegram Sender]

Self-Correction Loop

1. User Input: A Telegram message arrives (for example: “Run a performance audit”).
2. AI Generation: The node sends a system prompt plus the user text to one of the AI engines (Gemini / OpenRouter / DeepSeek).
3. Strict JSON Output: The model must always return:

{
  "speech": "analysis text",
  "cmd": "linux command or none"
}

4. Command Execution: If cmd !== "none", the node executes it via exec() (with a safe timeout).
5. Output Feedback: Stdout/stderr (truncated to fit Telegram limits) is sent back to the user along with the executed command.
6. Loop Trigger: The node re-emits a message to itself with:

content: "RESULT: <output>"
loopCount: loopCount + 1

7. Mission Complete: The loop stops when:
- The speech field contains MISSION_TERMINÉE, or
- loopCount exceeds 8.

This design ensures the AI does not just “guess” commands, but validates them against real terminal feedback and can refine its strategy.
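
The stop condition and the re-emitted message can be sketched in a few lines of JavaScript. This is only an illustration of the loop described above, not the module's actual source: the helper names shouldStop and nextLoopMessage and the constants are assumptions made for the example.

```javascript
// Hypothetical helpers sketching the loop decision; names are illustrative only.
const STOP_MARKER = "MISSION_TERMINÉE"; // end-of-mission marker required by the prompt
const MAX_LOOPS = 8;                    // loop safety limit described above

// Decide whether the loop should stop after the current AI reply.
function shouldStop(reply, loopCount) {
  return reply.speech.includes(STOP_MARKER) || loopCount > MAX_LOOPS;
}

// Build the message the node would re-emit to itself after running a command.
function nextLoopMessage(output, loopCount) {
  return { content: "RESULT: " + output, loopCount: loopCount + 1 };
}
```

With these helpers, a reply whose speech contains the marker ends the loop on any iteration, while a reply without it keeps the loop running until loopCount goes past 8.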

Use Cases

1. DevOps Automation & Incident Response

Scenario: Production CPU spike at 2 AM.
Command: “Diagnose high CPU usage and suggest fixes”.
Agent flow:
- Runs top -b -n 1, ps aux --sort=-%cpu, lsof -p <PID>.
- Identifies runaway processes.
- Proposes kill, service restart, or log cleanup.
- You confirm via Telegram; the agent executes.

2. Application Development & Debugging

Scenario: Node.js server won’t start.
Command: “Check why my app server won't start”.
Agent flow:
- Checks processes: lsof -i :3000, ps aux | grep node.
- Inspects logs: tail -n 100 /var/log/app.log, journalctl -u myapp.
- Detects missing dependencies, permission issues, or port conflicts.
- Suggests targeted fixes.

3. Ethical Hacking & Security Baseline

Scenario: Authorized internal security audit.
Command: “Run a security baseline audit on this server”.
Agent flow:
- System info: uname -a, cat /etc/os-release.
- Network: netstat -tlnp / ss -tlnp (open ports & services).
- Permissions: find / -perm -4000 -type f (SUID).
- Users/groups, sudo paths, firewall rules, patch level.
- Produces a concise baseline report.

4. Infrastructure Monitoring & Health Checks

Scenario: Daily infrastructure health report.
Command: “Give me a daily health check report”.
Agent flow:
- Disk: df -h, du -sh *.
- Memory/swap: free -h, vmstat.
- Network: ip addr show, ping -c 4 <host>.
- Services: systemctl --failed.
- Containers: docker ps, docker stats --no-stream.
- Returns a summary + alerts on thresholds.
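
As an illustration of the "alerts on thresholds" step, the sketch below parses df output and flags mount points above a limit. In the agent itself the AI model interprets the output; this function, its name diskAlerts, and the 90% default are assumptions made for the example. It expects the usual six-column df layout and does not handle filesystem names that contain spaces.

```javascript
// Hypothetical helper: flag mount points whose usage is at or above a threshold.
// Expects `df -h` or `df -P` style output: a header line, then six columns per row.
function diskAlerts(dfOutput, thresholdPercent = 90) {
  const alerts = [];
  const rows = dfOutput.trim().split("\n").slice(1); // skip the header line
  for (const row of rows) {
    const cols = row.trim().split(/\s+/);
    if (cols.length < 6) continue;                  // not a data row
    const usedPercent = parseInt(cols[4], 10);      // "91%" parses as 91
    const mount = cols.slice(5).join(" ");          // mount point may contain spaces
    if (!Number.isNaN(usedPercent) && usedPercent >= thresholdPercent) {
      alerts.push({ mount, usedPercent });
    }
  }
  return alerts;
}
```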

5. Backup & Disaster Recovery

Scenario: Weekly backup verification.
Command: “Verify last backup and simulate restore”.
Agent flow:
- Locates backup files, checks integrity (md5sum).
- Runs a schema-only dump as a lightweight sanity check (mysqldump --no-data / pg_dump --schema-only); this exercises the dump path rather than an actual restore.
- Reports size, age, restore feasibility; can clean up old backups.

6. Log Analysis & Troubleshooting

Scenario: Error spike in application logs.
Command: “Analyze app errors in the last 2 hours”.
Agent flow:
- Extracts errors with grep.
- Aggregates and sorts by frequency.
- Correlates with journalctl system logs.
- Suggests remediation: rate limiting, resource tuning, restarts, etc.
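
The "aggregate and sort by frequency" step is normally done by the agent with shell tools. The JavaScript sketch below shows the same idea for readers who want to post-process logs in a Function node. The function name topErrors is hypothetical, and the code assumes each log line starts with a single timestamp token followed by the message.

```javascript
// Hypothetical helper: count error lines and return the most frequent messages.
// Assumes each line looks like "<timestamp> <message>" (an assumed log format).
function topErrors(logText, limit = 5) {
  const counts = new Map();
  for (const line of logText.split("\n")) {
    if (!/error/i.test(line)) continue;                 // keep only error lines
    const message = line.replace(/^\S+\s+/, "").trim(); // drop the leading timestamp
    counts.set(message, (counts.get(message) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])                        // most frequent first
    .slice(0, limit)
    .map(([message, count]) => ({ message, count }));
}
```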

Quick Setup

Prerequisites

- Node-RED (version ≥ 2.0).
- Telegram bot token (via @BotFather).
- API keys: a Google Gemini API key and/or an OpenRouter API key; a DeepSeek key is optional.
- Linux/Unix system with standard CLI tools.

Installation

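Run both commands from your Node-RED user directory (typically ~/.node-red), then restart Node-RED.

Telegram nodes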

npm install node-red-contrib-telegrambot

Linux Agent DevOps

npm install node-red-contrib-linux-agent-devops

Basic flow

[Telegram Receiver] → [Linux Agent Devops] → [Telegram Sender]

Configure the node

In Linux Agent Devops:
- Set Gemini / OpenRouter / DeepSeek API keys.
- Set default Chat ID (or let Telegram nodes populate msg.payload.chatId).
- Optionally enable Allow SUDO (only if your Node-RED user has a controlled, passwordless sudo).
In Telegram nodes:
- Configure bot token and chat ID.
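
For reference, here is a minimal sketch of how the chat ID fallback and the outgoing message could fit together. The payload shape (chatId, type, content) follows the message format used by node-red-contrib-telegrambot; the helper names resolveChatId and toTelegramMessage, and the exact text layout, are assumptions for illustration.

```javascript
// Hypothetical helper: prefer the chat ID set by the Telegram Receiver,
// otherwise fall back to the default Chat ID configured on the node.
function resolveChatId(msg, defaultChatId) {
  return (msg.payload && msg.payload.chatId) || defaultChatId;
}

// Hypothetical helper: build a message for the Telegram Sender node that
// shows the executed command followed by its output.
function toTelegramMessage(chatId, cmd, output) {
  return {
    payload: {
      chatId,
      type: "message",
      content: "$ " + cmd + "\n" + output,
    },
  };
}
```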

Test

1. Deploy the flow.
2. Send the bot a message such as “Show me CPU usage”.
3. The agent executes the generated commands and sends the results back to Telegram.

Free Model Options

Option 1 – Google Gemini (recommended)

Model: gemini-2.0-flash-exp.
Pros:
- Native JSON output via responseMimeType: "application/json".
- Generous free tier for development and moderate use.
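
A request that asks Gemini for JSON output could look like the sketch below. It is based on the public generateContent REST endpoint and is not taken from this module's source. The function names are hypothetical, and askGemini assumes Node.js 18+ for the global fetch().

```javascript
// Build a generateContent request body that asks for a JSON reply.
// Field names follow the Gemini REST API as the author of this sketch understands it.
function buildGeminiRequest(systemPrompt, userText) {
  return {
    systemInstruction: { parts: [{ text: systemPrompt }] },
    contents: [{ role: "user", parts: [{ text: userText }] }],
    generationConfig: { responseMimeType: "application/json" },
  };
}

// Hypothetical caller: sends the request and parses the model's JSON reply.
async function askGemini(apiKey, systemPrompt, userText) {
  const model = "gemini-2.0-flash-exp";
  const url =
    "https://generativelanguage.googleapis.com/v1beta/models/" +
    model + ":generateContent?key=" + encodeURIComponent(apiKey);
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGeminiRequest(systemPrompt, userText)),
  });
  if (!res.ok) throw new Error("Gemini request failed: HTTP " + res.status);
  const data = await res.json();
  // The reply text is expected at candidates[0].content.parts[0].text.
  return JSON.parse(data.candidates[0].content.parts[0].text);
}
```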

Option 2 – OpenRouter

Free models: Mistral, Llama, Hermes, and others tagged :free.
Pros:
- Easy model switching behind a unified API.
- Good fit for experimentation and non‑critical workloads.
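
Because OpenRouter exposes an OpenAI-style chat completions endpoint, switching models only means changing one string. The sketch below tries a list of models in order. The model IDs are placeholders that may not match the current catalogue, the function names are hypothetical, and the failover strategy is an assumption rather than the module's documented behavior.

```javascript
// Placeholder model IDs; check the OpenRouter catalogue for models tagged ":free".
const FREE_MODELS = [
  "mistralai/mistral-7b-instruct:free",
  "meta-llama/llama-3.1-8b-instruct:free",
];

// Build the URL and body for one chat completion request.
function buildOpenRouterRequest(model, systemPrompt, userText) {
  return {
    url: "https://openrouter.ai/api/v1/chat/completions",
    body: {
      model,
      messages: [
        { role: "system", content: systemPrompt },
        { role: "user", content: userText },
      ],
    },
  };
}

// Hypothetical failover: try each model until one answers (Node.js 18+ for fetch).
async function askOpenRouter(apiKey, systemPrompt, userText, models = FREE_MODELS) {
  for (const model of models) {
    const { url, body } = buildOpenRouterRequest(model, systemPrompt, userText);
    try {
      const res = await fetch(url, {
        method: "POST",
        headers: { Authorization: "Bearer " + apiKey, "Content-Type": "application/json" },
        body: JSON.stringify(body),
      });
      if (!res.ok) continue; // HTTP error: try the next model
      const data = await res.json();
      return data.choices[0].message.content;
    } catch (err) {
      // Network or parsing error: fall through to the next model.
    }
  }
  throw new Error("no OpenRouter model produced a reply");
}
```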

Configuration & Tuning

Example Omni-Prompt

You are "Linux Agent Devops", an SRE and DevOps Root expert.
MISSIONS: Backup, automation, diagnostics, and security.
RULES:
1. Detect the user's language and respond in that language.
2. You are autonomous: install missing tools if necessary.
3. MISSION END: You MUST write "MISSION_TERMINÉE" to stop the loop.
STRICT JSON FORMAT: {"speech": "Linux Agent Devops analysis", "cmd": "linux command or none"}.

Important Parameters

- loopCount limit (default ~8): increase for complex tasks, decrease for safety.
- exec timeout: tune to the expected command duration (e.g. 15–60 seconds).
- Output truncation: keep command output under Telegram’s message limit (4096 characters). A common choice is to keep only the first 700–1000 characters, e.g. output.substring(0, 1000).
- Delay between loops: 1–1.5 seconds is typically enough to avoid rate limits.
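
The following sketch shows how these parameters might be wired together with Node's child_process.exec. The constant values and the names truncateOutput, runCommand, and sleep are assumptions chosen for the example, not the module's defaults.

```javascript
const { exec } = require("child_process");

// Illustrative values within the ranges suggested above (assumptions).
const EXEC_TIMEOUT_MS = 30000;  // exec timeout
const MAX_OUTPUT_CHARS = 1000;  // output truncation
const LOOP_DELAY_MS = 1200;     // delay between loops

// Keep only the first `limit` characters of the command output.
function truncateOutput(text, limit = MAX_OUTPUT_CHARS) {
  return text.substring(0, limit);
}

// Run a command with a timeout and always resolve with some text,
// so the result can be fed back to the AI even when the command fails.
function runCommand(cmd) {
  return new Promise((resolve) => {
    exec(cmd, { timeout: EXEC_TIMEOUT_MS }, (error, stdout, stderr) => {
      const output = stdout || stderr || (error ? error.message : "completed");
      resolve(truncateOutput(output));
    });
  });
}

// Pause between loop iterations to stay under API rate limits.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
```

A loop iteration would then await runCommand(cmd), send the result to Telegram, and await sleep(LOOP_DELAY_MS) before asking the model again.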

Expected JSON Response

The AI engine must always return:

{
  "speech": "Brief human-readable analysis or status message",
  "cmd": "linux command to execute or 'none' if no command needed"
}
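
Since models occasionally return malformed or incomplete JSON, it can help to validate the reply before acting on it. The sketch below is one possible check. The function name parseAgentReply and the returned object shape are hypothetical, and treating "None" or "NONE" like "none" is a leniency added for the example.

```javascript
// Hypothetical validator for the {"speech": ..., "cmd": ...} reply format.
function parseAgentReply(raw) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return { ok: false, error: "invalid JSON: " + err.message };
  }
  if (
    data === null ||
    typeof data !== "object" ||
    typeof data.speech !== "string" ||
    typeof data.cmd !== "string"
  ) {
    return { ok: false, error: "expected string fields 'speech' and 'cmd'" };
  }
  const cmd = data.cmd.trim();
  // Case-insensitive match on "none" is an assumption, not part of the spec.
  const hasCommand = cmd !== "" && cmd.toLowerCase() !== "none";
  return { ok: true, speech: data.speech, cmd, hasCommand };
}
```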

Security Considerations

Allow SUDO should only be enabled when:
- You fully understand the risk of executing AI-generated commands.
- The Node-RED user is restricted in sudoers (whitelisted commands, no full root shell).
For production environments:
- Consider adding command whitelists or sandboxing.
- Run the agent in a separate VM or container.
- Log all executed commands and outputs for audit.
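
As a starting point for a command whitelist, the sketch below accepts a command only if its first word is on an allowlist and it contains no shell metacharacters. Everything here is an assumption for illustration: the allowlist contents, the function names, and the log format. A binary-level check is coarse (it would still allow systemctl restart, for example), so treat it as one layer rather than a complete sandbox.

```javascript
// Hypothetical allowlist; adjust to the commands your agent actually needs.
const ALLOWED_BINARIES = new Set(["df", "free", "uptime", "ps", "ss", "systemctl", "journalctl"]);

// Characters that chain, redirect, or substitute commands in a shell.
const SHELL_METACHARS = /[;&|<>`$(){}\\\n]/;

// Accept a command only if it is a single allowlisted binary with plain arguments.
function isCommandAllowed(cmd) {
  const trimmed = cmd.trim();
  if (trimmed === "" || SHELL_METACHARS.test(trimmed)) return false;
  const binary = trimmed.split(/\s+/)[0];
  return ALLOWED_BINARIES.has(binary);
}

// Build one JSON line for an audit log of every command the agent proposes.
function auditEntry(cmd, allowed, now = new Date()) {
  return JSON.stringify({ time: now.toISOString(), cmd, allowed });
}
```

Note that this strict version also rejects pipes, so a command such as ps aux | grep node would need to be rewritten or explicitly allowed.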

Troubleshooting

AI returns syntax error / invalid JSON
- Verify that the system prompt enforces strict JSON.
- Call the model’s API directly (for example with curl) and confirm that it returns valid JSON before wiring it into Node-RED.
Commands execute but no visible output
- Some tools write to stderr instead of stdout.
- Combine both: stdout || stderr || "completed".
Loop never stops
- Ensure the model writes MISSION_TERMINÉE exactly in the speech field.
- If it never does, the loopCount safeguard still ends the loop once the configured limit is reached.
Telegram messages too long
- Reduce truncation length, or split output into multiple messages.
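
Two of the fixes above are small enough to sketch directly: falling back from stdout to stderr, and splitting long output into several Telegram messages. The function names are hypothetical. The split works on string length, so it may cut a line or an emoji in the middle.

```javascript
const TELEGRAM_LIMIT = 4096; // maximum characters per Telegram message

// Prefer stdout, then stderr, then a fixed marker when both are empty.
function pickOutput(stdout, stderr) {
  return stdout || stderr || "completed";
}

// Split text into consecutive chunks that each fit in one Telegram message.
function splitMessage(text, limit = TELEGRAM_LIMIT) {
  const chunks = [];
  for (let i = 0; i < text.length; i += limit) {
    chunks.push(text.substring(i, i + limit));
  }
  return chunks;
}
```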

Development & Contributing

This node is part of the surprise_dev ecosystem on npm:
https://www.npmjs.com/~surprise_dev

Contributions are welcome:

- Additional AI engine integrations.
- Stronger security (sandboxing, whitelisting, RBAC ideas).
- Better error handling and edge-case coverage.
- Performance optimizations and real-world SRE playbooks.

For questions, feature requests, or collaboration:

Email: [email protected]