agent-airlock

v0.1.6

Published

3 days ago

Human approval gate for AI agent actions. Drop-in protection against prompt injection for any API key.

theograe

ai agent security approval prompt-injection telegram human-in-the-loop

airlock

Approval gate for autonomous AI agents. Protect any API key from prompt injection and rogue behavior.

Works with any agent framework: Hermes, LangChain, CrewAI, AutoGPT, Claude Code, or raw API calls.

If you're running an autonomous AI agent that responds to messages, runs cron jobs, or browses the internet, airlock lets you give it API access without giving it full control. Your agent queues actions, and you approve or reject them from your phone via Telegram.

You decide which actions need approval and which don't. Let your agent read timelines, search, and manage lists freely, but require your approval before it tweets, sends DMs, or makes payments. Fine-grained control without slowing down the work that's safe to automate.

Prerequisites

Before setting up airlock, make sure:

1. Your agent cannot read your API keys. Airlock stores secrets in its own .airlock/.env file, but if your agent runs as the same OS user, it can read that file. Run your agent as a separate OS user with no access to airlock's directory. On Linux:
   ```
   # Create a restricted user for your agent
   sudo useradd -m -s /bin/bash agent
   # Your main user runs airlock and owns the secrets
   # The agent user can only reach airlock via HTTP on localhost
   ```
2. Your agent cannot sudo. If it can, it can read anything. Remove sudo access for the agent user.
3. Your agent calls airlock instead of the API directly. Instead of giving your agent an X API key, give it access to http://localhost:4444/queue. Airlock holds the real keys and only uses them when you approve.
4. Node.js 18+

Without the first two prerequisites, airlock's approval gate can be bypassed. The approval flow itself is protected by HMAC tamper detection and Telegram callback verification, but it assumes the agent process is isolated from the secrets.
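As a quick sanity check for the first prerequisite, you can confirm that the secrets file grants no permissions to group or other users. The sketch below is not part of airlock: `isOwnerOnly` and `checkSecretsFile` are hypothetical helpers, the default path assumes the `.airlock/.env` location described in this README, and the permission bits are only meaningful on POSIX systems.

```typescript
import { statSync } from 'node:fs'

// Hypothetical helper (not part of airlock): true when the permission bits
// grant nothing to group or other, e.g. 0o600 or 0o700.
function isOwnerOnly(mode: number): boolean {
  return (mode & 0o077) === 0
}

// Hypothetical helper: checks a secrets file on disk. The default path is an
// assumption based on the file layout described in this README.
function checkSecretsFile(path = '.airlock/.env'): boolean {
  const { mode } = statSync(path)
  return isOwnerOnly(mode)
}
```

This complements the separate OS user rather than replacing it: a file with mode 600 is still readable by any process running as its owner.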

Why

AI agents with API keys have two failure modes:

Prompt injection - a crafted input tricks your agent into taking actions you never intended
Rogue behavior - the agent misinterprets instructions or makes bad judgment calls on its own

Airlock solves both by putting a cryptographically verified human approval step between your agent and any sensitive action. The agent can read, search, and organize freely - but anything that speaks as you, spends your money, or contacts someone requires your explicit approval via Telegram.

How it works

Agent                    Airlock                  You (Telegram)
  |                        |                          |
  |-- POST /queue -------->|                          |
  |   {type, text, ...}    |-- sends message -------->|
  |                        |   with real content      |
  |<-- {id, pending} ------|   + approve/reject       |
  |                        |                          |
  |                        |<-- tap approve --------- |
  |                        |   (verified real tap)    |
  |                        |-- execute action         |
  |                        |-- "Approved" ----------->|

Security layers

OS-level secret isolation - API keys live where the agent process cannot read them
HMAC tamper detection - content is hashed at queue time and verified at execution, so the agent can't modify a pending request after you've seen it
Telegram callback verification - the server verifies each button tap is a real Telegram callback, not a forged HTTP request
User ID allowlist - only your Telegram account can approve actions
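To illustrate the HMAC tamper detection layer, here is a minimal sketch using Node's built-in crypto module. It is not airlock's actual implementation: the `QueuedAction` shape, the choice of signed fields, and the `signAction` and `verifyAction` names are all assumptions made for illustration.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto'

// Hypothetical shape of a queued action. Field names follow the POST /queue
// body shown in this README; the signing scheme itself is an assumption.
interface QueuedAction {
  id: string
  type: string
  text: string
}

// At queue time: sign the fields the human will see, using a key that only
// the airlock process can read.
function signAction(action: QueuedAction, key: string): string {
  const payload = JSON.stringify([action.id, action.type, action.text])
  return createHmac('sha256', key).update(payload).digest('hex')
}

// At execution time: recompute the tag and compare in constant time.
function verifyAction(action: QueuedAction, tag: string, key: string): boolean {
  const expected = Buffer.from(signAction(action, key), 'hex')
  const given = Buffer.from(tag, 'hex')
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === given.length && timingSafeEqual(expected, given)
}
```

If the stored action is edited between queueing and approval, the recomputed tag no longer matches and execution can be refused.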

Quick start

npx agent-airlock init

This walks you through setup:

Enter your Telegram bot token (create one via @BotFather)
Enter your Telegram user ID
Receive a test message confirming the setup works

Then define your executors in airlock.config.ts:

import { type AirlockConfig } from 'agent-airlock'

const config: AirlockConfig = {
  // ... (generated by init)

  executors: {
    tweet: async (data) => {
      const res = await fetch('https://api.x.com/2/tweets', {
        method: 'POST',
        headers: { Authorization: `Bearer ${process.env.X_TOKEN}`, 'Content-Type': 'application/json' },
        body: JSON.stringify({ text: data.text }),
      })
      // Report a failure instead of always claiming success
      if (!res.ok) {
        return { success: false, message: `X API returned HTTP ${res.status}` }
      }
      return { success: true, message: 'Tweet posted' }
    },

    email: async (data) => {
      // your email sending logic
      return { success: true, message: `Email sent to ${data.metadata?.to}` }
    },
  },
}

export default config

Start the server:

npx agent-airlock start

Agent integration

Your agent queues actions with a single HTTP call:

# Queue a tweet
curl -X POST http://localhost:4444/queue \
  -H "Content-Type: application/json" \
  -d '{"type": "tweet", "text": "hello world", "context": "engagement post"}'

# Queue an email
curl -X POST http://localhost:4444/queue \
  -H "Content-Type: application/json" \
  -d '{"type": "email", "text": "Meeting tomorrow?", "metadata": {"to": "user@example.com"}, "context": "follow-up"}'

# List pending approvals
curl http://localhost:4444/pending

The context field is optional - it lets the agent explain why it wants to take this action, which shows up in your Telegram message.
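From a Node-based agent, the same call can be wrapped in a small helper. This is a sketch rather than a client shipped with airlock: `buildQueueRequest` and `queueAction` are hypothetical names, the default base URL assumes port 4444 as used in this README, and the response type is inferred from the example response in the API section.

```typescript
// Body fields mirror the curl examples above.
interface QueueRequest {
  type: string
  text: string
  context?: string
  metadata?: Record<string, unknown>
}

// Assumed from the example response {"id": ..., "status": "pending"}.
interface QueueResponse {
  id: string
  status: string
}

interface BuiltRequest {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

// Hypothetical helper: builds the HTTP request without sending it.
function buildQueueRequest(action: QueueRequest, baseUrl = 'http://localhost:4444'): BuiltRequest {
  return {
    url: `${baseUrl}/queue`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(action),
    },
  }
}

// Hypothetical helper: queues the action and returns airlock's response.
async function queueAction(action: QueueRequest, baseUrl?: string): Promise<QueueResponse> {
  const { url, init } = buildQueueRequest(action, baseUrl)
  const res = await fetch(url, init)
  if (!res.ok) {
    throw new Error(`airlock returned HTTP ${res.status}`)
  }
  return (await res.json()) as QueueResponse
}
```

A call such as `await queueAction({ type: 'tweet', text: 'hello world', context: 'engagement post' })` would then correspond to the first curl example.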

Works locally, on servers, anywhere Node runs. No public IP or domain required.

Architecture

Airlock runs one HTTP server on localhost and polls Telegram for approval button taps.

| Component | Port | Purpose |
|-----------|------|---------|
| Queue server | 4444 | Agent submits actions (localhost only) |
| Telegram poller | - | Listens for approve/reject button taps |

The queue server binds to 127.0.0.1 - only local processes can reach it. Telegram communication uses long-polling, so no inbound ports or HTTPS certs are needed.
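For readers unfamiliar with long-polling, the sketch below shows the general pattern against the Telegram Bot API's `getUpdates` method, combined with a user ID allowlist check. It is an illustration under assumptions, not airlock's source: `nextOffset`, `approvedTaps`, and `poll` are hypothetical names, and the 30-second timeout and 1-second backoff are arbitrary choices.

```typescript
// Minimal subset of the Telegram Bot API types used here.
interface CallbackQuery {
  id: string
  from: { id: number }
  data?: string
}

interface Update {
  update_id: number
  callback_query?: CallbackQuery
}

// Telegram treats updates as confirmed once getUpdates is called with an
// offset higher than their update_id.
function nextOffset(updates: Update[], current: number): number {
  return updates.reduce((max, u) => Math.max(max, u.update_id + 1), current)
}

// Keep only button taps that come from the allowlisted user.
function approvedTaps(updates: Update[], allowedUserId: number): CallbackQuery[] {
  return updates
    .map((u) => u.callback_query)
    .filter((q): q is CallbackQuery => q !== undefined && q.from.id === allowedUserId)
}

// Long-poll loop: each outbound request stays open for up to 30 seconds,
// so no inbound port is needed.
async function poll(
  token: string,
  allowedUserId: number,
  onTap: (tap: CallbackQuery) => Promise<void>,
): Promise<never> {
  let offset = 0
  for (;;) {
    const params = new URLSearchParams({
      offset: String(offset),
      timeout: '30',
      allowed_updates: JSON.stringify(['callback_query']),
    })
    try {
      const res = await fetch(`https://api.telegram.org/bot${token}/getUpdates?${params.toString()}`)
      const body = (await res.json()) as { ok: boolean; result?: Update[] }
      if (!body.ok || !body.result) throw new Error('getUpdates failed')
      for (const tap of approvedTaps(body.result, allowedUserId)) await onTap(tap)
      offset = nextOffset(body.result, offset)
    } catch {
      // Back off briefly on network or API errors instead of spinning.
      await new Promise((resolve) => setTimeout(resolve, 1000))
    }
  }
}
```

Because the offset only advances after every tap in a batch has been handled, a failure in `onTap` causes the same batch to be fetched again on the next iteration.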

Files

.airlock/
  .env                 # Secrets (bot token, HMAC key)
  data/
    pending/           # Queued actions waiting for approval
    done/              # Resolved actions (approved/rejected)
airlock.config.ts      # Your executor definitions

Add .airlock/ to your .gitignore.

API

POST /queue

Queue an action for approval.

{
  "type": "tweet",
  "text": "The content to approve",
  "context": "Agent's reason for this action",
  "metadata": { "any": "extra data your executor needs" }
}

Returns: {"id": "a1b2c3d4", "status": "pending"}
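An agent can validate a request body before sending it. The type guard below is a sketch: `isQueueBody` is a hypothetical helper, and treating `type` and `text` as required while `context` and `metadata` are optional is an assumption drawn from the examples in this README, not a documented schema.

```typescript
interface QueueBody {
  type: string
  text: string
  context?: string
  metadata?: Record<string, unknown>
}

// Hypothetical validator: which fields are required is an assumption.
function isQueueBody(value: unknown): value is QueueBody {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  if (typeof v.type !== 'string' || v.type.length === 0) return false
  if (typeof v.text !== 'string') return false
  if (v.context !== undefined && typeof v.context !== 'string') return false
  if (v.metadata !== undefined) {
    // metadata must be a plain object, not null or an array
    if (typeof v.metadata !== 'object' || v.metadata === null || Array.isArray(v.metadata)) return false
  }
  return true
}
```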

GET /pending

List all pending approvals.

GET /health

Returns {"ok": true}.

License

MIT