testflow-ai
v0.5.5
Declarative API testing powered by YAML flows. Replace Postman with version-controlled, AI-friendly test definitions.
🧪 testflow-ai
YAML API flows + optional LLM assertions (local Ollama or cloud)
Version-controlled • CI-friendly • Agent-friendly
Multi-step flows (create → capture → reuse → assert)
Validate complex responses with AI (privacy-first via Ollama)
Keep API context in Markdown (great for humans & agents)
Documentation • Quick Start • Examples • AI Providers
What is testflow-ai?
testflow-ai lets you describe API scenarios in YAML files, run them from the command line or as a library, and (optionally) ask an AI model to judge complex responses. No GUI, no vendor lock‑in, and it works with any HTTP/GraphQL API.
Born from real-world frustration:
After days of testing APIs with Postman and burning tokens with ChatGPT, I built this to centralize tests in version-controlled YAML files with local AI support.
I wanted something that felt more like a test agent: a tool that could create data, mutate it, delete it, and walk full flows end‑to‑end, but defined in plain files, close to the code, and easy to run in CI.
testflow-ai is that tool: a thin engine that turns YAML flows into real HTTP calls, variable captures, assertions, and (if you want) AI‑powered checks.
Why it's different
Most API testing tools are either GUI-first (collections) or code-first (JS/TS test code).
testflow-ai is flow-first: readable YAML that runs in CI — with an optional AI judge when classic assertions aren't enough.
What you get:
- Flow engine: multi-step scenarios with capture + interpolation (CRUD, auth, webhooks, background jobs)
- AI assertions: validate complex text/structured responses with natural language checks (Ollama/OpenAI/Anthropic)
- Context-as-docs: a Markdown file that explains base URLs, endpoints, and rules — perfect input for AI agents too
When to use testflow-ai
- You want version-controlled API E2E flows (not a GUI collection)
- You need multi-step chaining (create → capture id → update → verify)
- You want CI/CD-ready output (console/json/markdown + exit codes + no external deps)
- You sometimes need an AI judge for fuzzy checks (content quality, summaries, "is this coherent?")
When NOT to use it
- You only need schema/property-based fuzzing from OpenAPI
- You prefer writing tests in code (Jest/Vitest) with full programmatic control
- You need browser/UI testing (Playwright/Cypress territory)
What testflow-ai optimizes for
| Goal | testflow-ai |
|:----:|:-----------:|
| Human-readable flows in Git | ✅ |
| Multi-step chaining + captures | ✅ |
| CI/CD-ready (exit codes, JSON) | ✅ |
| Optional AI-based assertions | ✅ |
| GUI collections | ❌ (not a goal) |
| Full code-based test suites | ❌ (use your test framework) |
Key Features
| Feature | Description |
|:-------:|:-----------:|
| YAML Flows | Define test sequences declaratively — version-controlled and human-readable |
| Variable Capture | Extract values from responses, reuse in later steps automatically |
| Rich Assertions | 10+ operators: equals, contains, exists, greaterThan, matches, and more |
| GraphQL Native | First-class support for queries and mutations |
| Async Polling | waitUntil for operations that take time (background jobs, processing) |
| AI Evaluation | Assert with natural language using Ollama, OpenAI, or Anthropic |
| Context Files | Define base URLs, endpoints, and rules in Markdown |
| Multiple Formats | Console (colored), JSON (CI/CD), or Markdown reports |
| Tag Filtering | Run subsets of your test suite (--tags smoke,e2e) |
| CLI + API | Use from terminal (npx testflow) or import as a library |
Quick Start
npm i -D testflow-ai

Create context.md:
# My API
## Base URLs
- api: http://localhost:3000

Create tests/todo.yaml:
name: Todo flow
tags: [smoke]
steps:
  - name: Create todo
    request:
      method: POST
      url: "{api}/todos"
      headers:
        Content-Type: application/json
      body:
        title: "Buy milk"
        completed: false
    capture:
      - name: todoId
        path: data.id
    assertions:
      - path: status
        operator: equals
        value: 201
  - name: Fetch todo
    request:
      method: GET
      url: "{api}/todos/${todoId}"
    assertions:
      - path: data.title
        operator: equals
        value: "Buy milk"

Run:

npx testflow --context ./context.md tests/todo.yaml

That's it. No config files, no GUI, no account.
Installation
npm install testflow-ai
# or
pnpm add testflow-ai
# or
yarn add testflow-ai

CLI Usage
# Run specific files
npx testflow flow1.yaml flow2.yaml
# Run all YAML files in a directory
npx testflow --dir ./tests
# Use a context file for base URLs
npx testflow --dir ./tests --context ./context.md
# Filter by tags (run only smoke tests)
npx testflow --dir ./tests --tags smoke
# JSON output (for CI/CD)
npx testflow --dir ./tests --format json
# Markdown output (for reports)
npx testflow --dir ./tests --format markdown
# Verbose mode (see step-by-step execution)
npx testflow --dir ./tests -v
# With AI evaluation
npx testflow --dir ./tests --ai-provider ollama --ai-model llama3.2:3b
npx testflow --dir ./tests --ai-provider openai --ai-key $OPENAI_API_KEY --ai-model gpt-4o-mini

Programmatic API
Simple usage
import { runTests } from 'testflow-ai';

const report = await runTests({
  contextFile: './context.md',
  testDir: './tests',
  tags: ['smoke'],
  format: 'console',
  verbose: true,
});

console.log(`${report.passedFlows}/${report.totalFlows} passed`);
process.exit(report.failedFlows > 0 ? 1 : 0);

Here's a complete example using a Todo List API:
Project Structure
my-api/
├── tests/
│ ├── index.ts # Test runner
│ ├── context.md # API context
│ └── flows/
│ ├── todo-crud.yaml
│ └── todo-graphql.yaml
└── package.json

Test Runner (tests/index.ts)
import { runTests, type RunnerOptions } from 'testflow-ai';
import * as path from 'path';

async function main() {
  const options: RunnerOptions = {
    contextFile: path.join(__dirname, 'context.md'),
    testDir: path.join(__dirname, 'flows'),
    tags: process.argv.includes('--tags=smoke') ? ['smoke'] : undefined,
    format: 'console',
    verbose: false,
  };
  const report = await runTests(options);
  process.exit(report.failedFlows > 0 ? 1 : 0);
}

main();

Context File (tests/context.md)
# Todo List API
## Description
A simple REST API for managing todo items.
## Base URLs
- api: http://localhost:3000
- graphql: http://localhost:3000/graphql
## Endpoints
- POST /todos - Create a new todo
- GET /todos/:id - Get todo by ID
- PUT /todos/:id - Update todo
- DELETE /todos/:id - Delete todo
- POST /graphql - GraphQL endpoint

Test Flow (tests/flows/todo-crud.yaml)
name: Todo CRUD Flow
tags: [todos, crud, smoke]
steps:
  - name: Create todo
    request:
      method: POST
      url: "{api}/todos"
      headers:
        Content-Type: application/json
      body:
        title: "Buy groceries"
        completed: false
    capture:
      - name: todoId
        path: data.id
    assertions:
      - path: status
        operator: equals
        value: 201
      - path: data.title
        operator: equals
        value: "Buy groceries"
  - name: Get todo
    request:
      method: GET
      url: "{api}/todos/${todoId}"
    assertions:
      - path: data.id
        operator: equals
        value: "${todoId}"
  - name: Update todo
    request:
      method: PUT
      url: "{api}/todos/${todoId}"
      headers:
        Content-Type: application/json
      body:
        completed: true
    assertions:
      - path: data.completed
        operator: equals
        value: true

Running Tests
# Add to package.json scripts:
"test:e2e": "ts-node tests/index.ts"
"test:smoke": "ts-node tests/index.ts --tags=smoke"
# Then run:
npm run test:e2e
npm run test:smoke

Advanced usage
import { TestRunner, FlowExecutor, parseYamlFile, parseContextFile } from 'testflow-ai';

// Runner with full control
const runner = new TestRunner({
  contextFile: './context.md',
  testFiles: ['./tests/todo-crud.yaml'],
  ai: { provider: 'ollama', model: 'mistral:7b' },
});
const report = await runner.run();

// Manual execution
const context = await parseContextFile('./context.md');
const flow = await parseYamlFile('./tests/todo-crud.yaml');
const executor = new FlowExecutor(context, true);
const result = await executor.executeFlow(flow);

Basic structure
name: Todo Lifecycle
description: Create a todo and verify it exists
tags:
  - todos
  - smoke
steps:
  - name: Create todo
    request:
      method: POST
      url: "{api}/todos"
      headers:
        Content-Type: application/json
      body:
        title: "Buy groceries"
        completed: false
    capture:
      - name: todoId
        path: data.id
    assertions:
      - path: status
        operator: equals
        value: 201
      - path: data.title
        operator: equals
        value: "Buy groceries"
  - name: Verify todo
    request:
      method: GET
      url: "{api}/todos/${todoId}"
    assertions:
      - path: data.id
        operator: equals
        value: "${todoId}"

GraphQL requests
steps:
  - name: Query todo
    request:
      method: POST
      url: "{graphql}"
      graphql:
        query: |
          query GetTodo($id: ID!) {
            todo(id: $id) {
              id
              title
              completed
            }
          }
        variables:
          id: "${todoId}"
    capture:
      - name: todoTitle
        path: data.todo.title

Variable capture and interpolation
Variables captured in one step are available in all subsequent steps:
steps:
  - name: Create todo
    request:
      method: POST
      url: "{api}/todos"
      headers:
        Content-Type: application/json
      body:
        title: "Read docs"
        completed: false
    capture:
      - name: todoId
        path: data.id
      - name: todoTitle
        path: data.title
  - name: Verify todo title
    request:
      method: GET
      url: "{api}/todos/${todoId}"
    assertions:
      - path: data.title
        operator: equals
        value: "${todoTitle}"

Supported patterns:
- `${variable}` — simple variable
- `${data.nested.field}` — nested path
- `${items[0].id}` — array access
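As an illustration, the interpolation rules above can be sketched as a small resolver. This is not the real engine; `getByPath` and `interpolate` are hypothetical helper names used only for this sketch.

```typescript
// Illustrative sketch of ${...} interpolation (not testflow-ai's actual code).
type Vars = Record<string, unknown>;

// Resolve a dotted/indexed path like "items[0].id" against captured variables.
function getByPath(vars: Vars, path: string): unknown {
  const parts = path.replace(/\[(\d+)\]/g, ".$1").split(".");
  return parts.reduce<unknown>(
    (acc, key) => (acc != null ? (acc as Record<string, unknown>)[key] : undefined),
    vars
  );
}

// Replace every ${expr} occurrence in a template string.
function interpolate(template: string, vars: Vars): string {
  return template.replace(/\$\{([^}]+)\}/g, (_m, expr: string) =>
    String(getByPath(vars, expr.trim()))
  );
}

const vars = { todoId: "abc-123", data: { nested: { field: 7 } }, items: [{ id: "x1" }] };
const url = interpolate("{api}/todos/${todoId}", vars); // "{api}/todos/abc-123"
```

Note that `{api}` is left untouched here: single-brace base-URL keys come from the context file, while `${...}` refers to captured variables.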
Async polling (waitUntil)
For operations that take time — polls until condition is met or timeout:
steps:
  - name: Wait for todo sync
    request:
      method: GET
      url: "{api}/todos/${todoId}/sync-status"
    waitUntil:
      path: data.status
      operator: equals
      value: "SYNCED"
      timeout: 30000 # max wait (ms)
      interval: 2000 # poll every (ms)
    assertions:
      - path: data.status
        operator: equals
        value: "SYNCED"

Assertion operators

| Operator | Description | Example |
|:--------:|:-----------:|:-------:|
| equals | Exact match (deep equality) | value: 200 |
| notEquals | Not equal | value: null |
| contains | String/array contains | value: "success" |
| notContains | Does not contain | value: "error" |
| exists | Not null/undefined | — |
| notExists | Is null/undefined | — |
| greaterThan | Number comparison | value: 0 |
| lessThan | Number comparison | value: 100 |
| matches | Regex match | value: "^[a-z]+$" |
| ai-evaluate | AI-powered evaluation | value: "Is this valid?" |
Special paths:
- `status` — HTTP status code (when the expected value is a number)
- `httpStatus` — always the HTTP status code
- `data.field` — response body field
- `data.items[0].id` — array access
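For instance, a single step can mix these path styles and operators. The `/health` endpoint and its response fields below are hypothetical, chosen only to illustrate the syntax:

```yaml
- name: Check API health
  request:
    method: GET
    url: "{api}/health"
  assertions:
    - path: status              # special path: HTTP status code
      operator: equals
      value: 200
    - path: data.uptime         # response body field
      operator: greaterThan
      value: 0
    - path: data.checks[0].name # array access
      operator: matches
      value: "^[a-z-]+$"
```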
AI-Powered Evaluation
Use AI to assert things that are hard to express with traditional operators. testflow-ai supports multiple providers:
| Provider | Type | Setup | Best For |
|:--------:|:----:|:-----:|:--------:|
| 🦙 Ollama | Local | Free, no API key | Privacy, offline, cost-effective |
| 🤖 OpenAI | Cloud | API key required | High accuracy, GPT-4 |
| 🧠 Anthropic | Cloud | API key required | Claude models, safety-focused |
Ollama (Local, Recommended)
No cloud API keys, no data leaves your machine.
Install Ollama — ollama.com/download
Pull a model:
# Recommended — good balance of speed and quality
ollama pull llama3.2:3b
# Faster, lighter (for limited hardware)
ollama pull llama3.2:1b
# More accurate (needs ~8GB RAM)
ollama pull mistral:7b

Start Ollama (runs on http://localhost:11434 by default):

ollama serve

Usage:
# CLI
npx testflow --dir ./tests --ai-provider ollama --ai-model llama3.2:3b
# Programmatic
const report = await runTests({
  testDir: './tests',
  ai: {
    provider: 'ollama',
    url: 'http://localhost:11434',
    model: 'llama3.2:3b',
  },
});

OpenAI (Cloud)
Requires API key from platform.openai.com
# CLI
npx testflow --dir ./tests \
--ai-provider openai \
--ai-key $OPENAI_API_KEY \
--ai-model gpt-4o-mini
# Programmatic
const report = await runTests({
  testDir: './tests',
  ai: {
    provider: 'openai',
    apiKey: process.env.OPENAI_API_KEY,
    model: 'gpt-4o-mini',
  },
});

Supported models: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo
Anthropic (Cloud)
Requires API key from console.anthropic.com
# CLI
npx testflow --dir ./tests \
--ai-provider anthropic \
--ai-key $ANTHROPIC_API_KEY \
--ai-model claude-3-haiku-20240307
# Programmatic
const report = await runTests({
  testDir: './tests',
  ai: {
    provider: 'anthropic',
    apiKey: process.env.ANTHROPIC_API_KEY,
    model: 'claude-3-haiku-20240307',
  },
});

Supported models: claude-3-5-sonnet-20241022, claude-3-opus-20240229, claude-3-haiku-20240307
Using AI assertions
steps:
  - name: Check todo description
    request:
      method: GET
      url: "{api}/todos/${todoId}"
    assertions:
      # Traditional assertion
      - path: status
        operator: equals
        value: 200
      # AI-powered assertion (works with any provider)
      - path: data.description
        operator: ai-evaluate
        value: "Is this a well-formed task description with a clear action item?"

Context file AI config
## AI Configuration
- provider: ollama
- url: http://localhost:11434
- model: llama3.2:3b
# Or for cloud providers:
# provider: openai
# apiKey: ${OPENAI_API_KEY}
# model: gpt-4o-mini

🔒 Privacy note: Ollama runs entirely locally. OpenAI and Anthropic send data to their APIs. Choose based on your privacy requirements.
AI assertions in CI (recommended settings)
AI checks can be non-deterministic. For CI, prefer:
- Deterministic settings (e.g. `temperature: 0` for OpenAI/Anthropic)
- Short, specific prompts (avoid vague questions)
- Stable models (avoid preview/beta models)
Example:
ai: {
  provider: 'openai',
  model: 'gpt-4o-mini',
  // temperature isn't configurable via testflow-ai yet, so prefer stable models
}

- Avoid committing API keys. Use environment variables (e.g. `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`).
- The runner redacts common secret fields in logs (Authorization headers, tokens, cookies) when verbose mode is enabled.
- Keep sensitive data out of YAML files — use environment variable interpolation or context files with `.gitignore`.
Example:
headers:
  Authorization: "Bearer ${API_TOKEN}" # Use env vars

Best practices:
- Store secrets in `.env` files (add them to `.gitignore`)
- Use context files for non-sensitive config (base URLs, endpoints)
- Never commit API keys or tokens in YAML files
A JSON Schema for `*.yaml` test flows gives you autocomplete + validation in editors (see the availability note below).
VSCode setup (.vscode/settings.json):
{
  "yaml.schemas": {
    "https://raw.githubusercontent.com/carbajalmarcos/testflow-ai/main/schemas/testflow.schema.json": [
      "tests/**/*.yaml",
      "**/*.testflow.yaml"
    ]
  }
}

This gives you:
- ✅ Autocomplete for `name`, `steps`, `request`, `assertions`, etc.
- ✅ Validation for required fields and types
- ✅ Hover documentation for operators and options
Note: JSON Schema coming in a future release. For now, TypeScript types provide autocomplete via `import type { TestFlow } from 'testflow-ai'`.
Define your project context in Markdown. The runner uses it to resolve {baseUrlKey} references in your YAML flows.
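To make the substitution concrete, here is a hypothetical sketch (not the actual parser): keys from the `## Base URLs` section map to URLs, and `{key}` references in a flow's `url` are replaced with the mapped value.

```typescript
// Illustrative only: how {baseUrlKey} resolution could work.
const baseUrls: Record<string, string> = {
  api: "http://localhost:3000",
  graphql: "http://localhost:3000/graphql",
};

// Replace {key} with the mapped base URL; unknown keys are left untouched.
function resolveUrl(template: string, urls: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match, key) => urls[key] ?? match);
}

const resolved = resolveUrl("{api}/todos", baseUrls); // "http://localhost:3000/todos"
```

Leaving unknown keys untouched matters because `${todoId}`-style variable references are resolved separately from base-URL keys.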
# Todo List API
## Description
A REST + GraphQL API for managing todo items.
## Base URLs
- api: http://localhost:3000
- graphql: http://localhost:3000/graphql
## Endpoints
- POST /todos - Create todo
- GET /todos/:id - Get todo by ID
- PUT /todos/:id - Update todo
- DELETE /todos/:id - Delete todo
- POST /graphql - GraphQL endpoint
## Rules
- All endpoints return JSON
- Todos have: id, title, completed, createdAt
## AI Configuration
- provider: ollama
- url: http://localhost:11434
- model: llama3.2:3b

testflow-ai works in any CI/CD pipeline:
- Exit code `0` = success, `1` = failure (CI will fail automatically)
- JSON output: `--format json` for parsing results
- Tag filtering: `--tags smoke` for faster runs
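If you consume the JSON report yourself (dashboards, custom gates), a minimal sketch could look like this. The field names are assumed from the programmatic API section (`totalFlows` / `passedFlows` / `failedFlows`); the sample string stands in for real CLI output.

```typescript
// Illustrative CI helper consuming a testflow-ai JSON report (assumed shape).
interface Report {
  totalFlows: number;
  passedFlows: number;
  failedFlows: number;
}

// In a real pipeline this JSON would come from:
//   npx testflow --dir ./tests --format json > report.json
const raw = '{"totalFlows": 3, "passedFlows": 2, "failedFlows": 1}'; // sample

const report: Report = JSON.parse(raw);
console.log(`${report.passedFlows}/${report.totalFlows} flows passed`);

// Mirror the CLI's exit-code convention: non-zero fails the job.
const exitCode = report.failedFlows > 0 ? 1 : 0;
```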
GitHub Actions
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - run: npm install -D testflow-ai
      - run: npm run start:server &
      - run: npx testflow --dir ./tests --context ./context.md

That's it. If tests fail, the job fails automatically (exit code 1).
Console Output
══════════════════════════════════════════════════════════════
TESTFLOW AI — RESULTS
══════════════════════════════════════════════════════════════
Summary:
Total: 3 flows
Passed: 2
Failed: 1
Duration: 1850ms
Narrative:
PASS **Todo CRUD**
→ Create todo
CAPTURE todoId: abc-123
→ Read todo
→ Update todo
→ Delete todo
PASS **Todo GraphQL**
→ Create todo (mutation)
CAPTURE todoId: def-456
→ Query todo
FAIL **Todo Bulk Import**
✗ Import todos from CSV
WARN: Expected status to equal 200, got 500
══════════════════════════════════════════════════════════════

Roadmap

- [ ] Database assertions (verify records directly via SQL)
- [ ] gRPC / RPC support
- [ ] OpenAPI spec → auto-generate test flows
- [ ] Watch mode (re-run on file change)
- [ ] Parallel flow execution
- [ ] HTML report output
- [ ] `testflow init` wizard
Examples
See the examples/ directory for:
- REST CRUD (`rest-crud.yaml`) — Full todo lifecycle: create → read → update → verify
- Auth Flow (`auth-flow.yaml`) — Login, create todo with token, verify access control
- GraphQL Flow (`graphql-flow.yaml`) — Create + query todos via GraphQL mutations
- Todo CRUD (`todo-crud.yaml`) — Extended CRUD with delete + verify deletion
- Todo GraphQL (`todo-graphql.yaml`) — GraphQL mutations and queries with variable capture
- Context Files (`context.md`, `todo-list-context.md`) — API context templates
Quick start with examples:
# Run a specific flow
npx testflow --context ./examples/context.md ./examples/rest-crud.yaml
# Run the auth flow
npx testflow --context ./examples/context.md ./examples/auth-flow.yaml
# Run all examples
npx testflow --dir ./examples --context ./examples/context.md

Made with ❤️ by Marcos Carbajal
⭐ Star on GitHub • 📦 npm • 🐛 Report a bug • 💬 Discussions
Support
If testflow-ai saved you time, consider supporting its development:
- Bitcoin (BTC): `bc1qv0ddjg3wcgujk9ad66v9msz8manu5tanhvq0fn`
- USDT (ERC-20): `0x79F57C9D45d2D40420EF071DDAaA27057618E7C8`
Every contribution helps keep the project moving. Thank you!