@teichai/datagen
v0.1.11
DataGen - By TeichAI
An easy-to-use CLI to generate JSONL datasets from a TXT file using LLMs.
Install
npm i -g @teichai/datagen
Or install locally and run via npx:
npm i -D @teichai/datagen
npx datagen --help
Run tests:
npm test
Usage
Set your OpenRouter API key:
export API_KEY="your_openrouter_key"
Create a prompts file where each line is a prompt:
Explain the CAP theorem in simple terms.
Write a Python function to reverse a linked list.
Run:
datagen --model openai/gpt-4o-mini --prompts prompts.txt
Note: On startup, datagen does a quick best-effort check for a newer npm version and prints an upgrade command if available. Disable with DATAGEN_DISABLE_UPDATE_CHECK=1.
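The usage steps can be sketched as one short script. This is a minimal sketch, not part of the package: the count_prompts helper is hypothetical and only counts non-empty lines as a sanity check (the README does not say how datagen treats blank lines), and the run is skipped unless datagen is installed and API_KEY is set.

```shell
# Write a prompts file with one prompt per line (the two example prompts from this README).
cat > prompts.txt <<'EOF'
Explain the CAP theorem in simple terms.
Write a Python function to reverse a linked list.
EOF

# Hypothetical helper: count the non-empty lines in a prompts file.
count_prompts() {
  grep -c . "$1"
}

echo "prompts: $(count_prompts prompts.txt)"

# Run datagen only when it is on PATH and an API key is available.
if command -v datagen >/dev/null 2>&1 && [ -n "${API_KEY:-}" ]; then
  datagen --model openai/gpt-4o-mini --prompts prompts.txt
else
  echo "skipping run: datagen not installed or API_KEY not set"
fi
```

Checking the prompt count before the run is a cheap way to catch an empty or truncated file before any API requests are made.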
Development (build + run once):
API_KEY="your_openrouter_key" npm run dev -- --model openai/gpt-4o-mini --prompts prompts.txt
Options
--help: show the help message and exit.
--version: print the CLI version and exit.
--config: set a config file.
--model <name>: required model name.
--prompts <file>: required prompts file.
--out <file>: output JSONL (default dataset.jsonl).
--api <baseUrl>: API base (default OpenRouter).
--system <text>: optional system prompt.
--store-system true|false: store system message in output (default true).
--concurrent <num>: number of in-flight requests (default 1).
--openrouter.provider <slugs>: comma-separated provider slugs to try in order (OpenRouter only).
--openrouter.providerSort <price|throughput|latency>: provider routing sort (OpenRouter only).
--reasoningEffort <none|minimal|low|medium|high|xhigh>: pass through as reasoning.effort.
--no-progress: disable the progress bar.
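Several of the options above can be combined in one invocation. The sketch below uses only flags listed in this README; the system prompt text and the concurrency value of 4 are illustrative assumptions, not recommendations from the package. The command is printed first and only executed when datagen is installed and API_KEY is set.

```shell
# Build the argument list once so it can be inspected before running.
set -- \
  --model openai/gpt-4o-mini \
  --prompts prompts.txt \
  --out dataset.jsonl \
  --system "You are a concise technical tutor." \
  --store-system false \
  --concurrent 4 \
  --no-progress

# Show the command that would be run.
echo "datagen $*"

# Run with the startup update check disabled, but only when datagen and an API key are available.
if command -v datagen >/dev/null 2>&1 && [ -n "${API_KEY:-}" ]; then
  DATAGEN_DISABLE_UPDATE_CHECK=1 datagen "$@"
fi
```

Passing "$@" rather than $* keeps the quoted system prompt as a single argument.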
