@zhiweiliu/playwright-generator
v1.0.60
Generate Playwright test cases from natural language using LLM
Playwright LLM Test Case Generator
⚠️ Version Notice: Versions v1.0.1 to v1.0.55 are development builds and are not recommended for use. Please upgrade to v1.0.56 or above if you are still running these versions.
This module streamlines the generation of Playwright test cases by integrating with Large Language Models (LLMs). Many AI-based test frameworks already let you write test cases in natural language, but they tend to share the drawbacks described below.
Table of Contents
- Drawbacks of Current AI-Based Test Frameworks
- Why Playwright + AI Test Case Generator?
- Features
- Prerequisites
- Installation & Quick Start
- Project Structure
- Configuration
- Local LLM Setup (Ollama)
- VS Code Extension
- Contributing
- License
- Support
Drawbacks of Current AI-Based Test Frameworks
Lack of Precision and Accuracy: AI-generated tests may not accurately capture complex user interactions, edge cases, or application-specific logic, leading to false positives or missed bugs.
Maintenance Overhead: As applications evolve, AI-generated tests often require manual updates and refactoring, negating some of the time-saving benefits.
Reliability Issues: Large Language Models can hallucinate or generate incorrect test logic, especially for dynamic web applications with complex state management.
Limited Integration: Many AI-based frameworks lack seamless integration with existing CI/CD pipelines, version control systems, and testing infrastructure.
Cost and Resource Intensity: Frequent API calls to LLM services can become expensive, and there's a dependency on external services that may have rate limits or downtime.
Security Concerns: Sharing application details with external AI services raises potential security risks, especially for proprietary or sensitive codebases.
Why Playwright + AI Test Case Generator?
Playwright combined with AI offers a powerful solution that addresses these drawbacks:
Robust Web Testing Foundation: Playwright provides a reliable, battle-tested framework for end-to-end web testing, handling modern web applications with features like auto-waiting, network interception, and cross-browser support.
Enhanced Test Generation: AI integration allows for natural language test case descriptions, accelerating test creation while maintaining the precision and reliability of Playwright's code generation.
Maintainable Code: Generated Playwright tests are actual code that can be easily reviewed, modified, and maintained by developers, unlike some AI frameworks that produce abstract test descriptions.
Seamless Integration: Playwright integrates well with existing development workflows, CI/CD pipelines, and can be enhanced with AI without compromising existing infrastructure.
Cost-Effective: By generating high-quality Playwright code upfront, this approach reduces the need for continuous AI API calls during test maintenance and execution.
Security-First: The AI integration can be implemented with local models or controlled API usage, minimizing security risks associated with external AI services.
Features
This project implements an npx command in TypeScript and is released as the npm package @zhiweiliu/playwright-generator. After installing it, you can set up a Playwright framework with AI support by running a single command:
npx playwright-generator init
The installed module provides the features described in the sections below to support day-to-day test automation tasks.
Prerequisites
- Node.js: Version 16.0 or higher
- npm: Version 7.0 or higher
- Git: For version control integration
- LLM API Access: Depending on your AI model choice:
  - Claude: Anthropic API key with Claude access enabled (visit https://console.anthropic.com/)
  - Azure OpenAI: Azure subscription with an OpenAI resource and deployment
  - ChatGPT: OpenAI API key (visit https://platform.openai.com/)
  - Local LLM: Ollama installed locally (no API key required; see Local LLM Setup)
Installation & Quick Start
Install the generator:
npm install -g @zhiweiliu/playwright-generator
Initialize a new project:
npx playwright-generator init
Note: init now creates tests/example.test.md with comprehensive SauceDemo e-commerce test cases by default.
Configure your environment:
cp .env.example .env
# Edit .env with your API credentials
Write your first test case in the tests/ folder
Generate Playwright code:
npx playwright-generator generate --tc TC-0001
Project Structure
project-root/
├── tests/                   # Natural language test cases (includes sample test cases)
│   └── *.test.md
├── helpers/                 # Natural language helper definitions
│   └── *.md
├── generated/               # Generated Playwright test code
│   ├── generated.test.ts
│   └── helpers/             # Generated helper classes
│       └── *.ts
├── audit/                   # Screenshots and artifacts from failed tests
│   └── screenshots/
├── .env                     # Environment variables (local only, ignored by Git)
├── .env.example             # Example environment file
├── playwright.config.ts     # Playwright configuration
└── package.json
Configuration
Environment Variables (.env)
# AI Model Configuration
AI_MODEL=claude # Options: claude, azure-openai, chatgpt, local
CLAUDE_API_KEY=sk-ant-... # Required if using Claude (starts with sk-ant-)
AZURE_OPENAI_API_KEY= # Required if using Azure OpenAI
AZURE_OPENAI_ENDPOINT= # e.g. https://<resource>.openai.azure.com
AZURE_OPENAI_DEPLOYMENT= # e.g. gpt-4o
AZURE_OPENAI_API_VERSION=2024-02-01 # Optional, defaults to 2024-02-01
CHATGPT_API_KEY= # Required if using ChatGPT (starts with sk-)
CHATGPT_MODEL=gpt-4o # Optional, defaults to gpt-4o
LOCAL_LLM_URL=http://localhost:11434 # Required if using local LLM (Ollama default)
LOCAL_LLM_MODEL=llama3 # Required if using local LLM (e.g. llama3, codellama, qwen2.5-coder)
# Playwright Configuration
BROWSER=chromium # Options: chromium, firefox, webkit
HEADLESS=true # Run in headless mode
BASE_URL=http://localhost:3000 # Application under test URL
# Execution Configuration
TIMEOUT=30000 # Test timeout in milliseconds
RETRIES=1 # Number of retries on failure
Preset Test Framework
- A fully working Playwright test automation framework is already set up
Test Cases Written in Natural Language
- Test cases are stored in the tests/ folder under the project root using .test.md files
- Tags with format [TAG-NAME] can be applied to natural language test cases to allow grouping and running related tests
- Each test case must have a unique ID tag in format [TC-xxxx] (e.g., [TC-0001]), which enables running specific test cases
- You can specify an output file in the generated/ folder; otherwise, generated code will output to generated.test.ts
- Natural language descriptions should be clear and specific to improve AI-generated code quality
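As an illustration of the tag format described above, the following is a minimal sketch of how a tag line could be split into a test case ID and grouping tags. The function name parseTagLine and the rule "the first tag starting with TC- is the ID" are assumptions made for this example; the generator's actual parsing may differ.

```typescript
// Hypothetical helper: parseTagLine is not part of playwright-generator.
interface ParsedTags {
  id: string | null; // the test case ID tag, if one is present
  tags: string[]; // every other tag, e.g. SMOKE, LOGIN
}

function parseTagLine(line: string): ParsedTags {
  // Collect the text inside every [ ... ] pair on the line.
  const pattern = /\[([^\[\]]+)\]/g;
  const found: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(line)) !== null) {
    found.push(m[1].trim());
  }

  // Assumption: the first tag that starts with "TC-" is the test case ID.
  let id: string | null = null;
  const tags: string[] = [];
  for (const t of found) {
    if (id === null && t.indexOf("TC-") === 0) id = t;
    else tags.push(t);
  }
  return { id, tags };
}

const parsed = parseTagLine("[TC-0001] [SMOKE] [LOGIN]");
console.log(parsed.id, parsed.tags);
```

Treating any TC- prefixed tag as the ID keeps the sketch compatible with IDs such as TC-SAMPLE-0001 used by the sample test cases.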
Example Test Case File (tests/login.test.md):
[TC-0001] [SMOKE] [LOGIN]
# User logs in with valid credentials
- Given the user is on the login page
- When the user enters valid username and password
- And clicks the login button
- Then the user should be redirected to the dashboard
Sample Test Cases
For reference and testing purposes, sample test cases are provided in the tests/ folder:
tests/example.test.md: Comprehensive test cases for the SauceDemo e-commerce website (https://saucedemo.com), including complete purchase flow and product browsing scenarios with detailed step-by-step descriptions.
You can use these samples to:
- Test the generator with real-world e-commerce scenarios
- Understand the level of detail needed in natural language descriptions
- Generate Playwright code for immediate use and validation
To generate code from a sample test case:
npx playwright-generator generate --tc TC-SAMPLE-0001
Generation
Playwright test automation code is generated by running a command, with generated code placed in the generated/ folder under the project root.
- Generated Playwright code is in TypeScript
- The test case ID tag must be specified with the generation command, either as a single ID or as a comma-separated list of IDs
- If an output file is specified, the command will either append the generated test case to the file or update it if it already exists
- If no output file is specified, the command will either append to generated.test.ts or update the test case if it exists
- The following AI models are supported:
  - Claude: Anthropic Claude 3 Haiku API (widely available)
  - Azure OpenAI: Azure-hosted OpenAI models (e.g. gpt-4o)
  - ChatGPT: OpenAI API (e.g. gpt-4o)
  - Local LLM: Any Ollama-compatible local model (e.g. llama3, codellama, qwen2.5-coder, deepseek-coder-v2)
The generator now strips Markdown and explanation text from model output automatically, keeping only the extracted TypeScript test function before writing to generated/*.test.ts.
- Credentials (LLM API keys, usernames, passwords) are retrieved from environment variables in the .env file for local development; the .env file should be ignored by Git
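The stripping of Markdown and explanation text mentioned above could look roughly like the sketch below. It is not the generator's real implementation: the function name extractTypeScript and the heuristic of cutting at the first test( call are assumptions chosen for illustration.

```typescript
// Hypothetical sketch; the generator's real extraction logic may differ.
const FENCE = "\u0060\u0060\u0060"; // three backticks, written as escapes

function extractTypeScript(modelOutput: string): string {
  // Prefer the contents of the first fenced block, with or without a language tag.
  const fenced = new RegExp(
    FENCE + "(?:typescript|ts)?[ \\t]*\\r?\\n([\\s\\S]*?)" + FENCE,
  ).exec(modelOutput);
  const code = fenced ? fenced[1] : modelOutput;

  // Assumption: the test function starts at the first "test(" call,
  // so anything before it is treated as explanation text.
  const start = code.indexOf("test(");
  return (start >= 0 ? code.slice(start) : code).trim();
}

const reply = [
  "Here is the test:",
  FENCE + "typescript",
  'test("TC-0001 login", async ({ page }) => {',
  '  await page.goto("/login");',
  "});",
  FENCE,
  "Let me know if you need changes.",
].join("\n");

console.log(extractTypeScript(reply));
```

When the reply contains no fenced block, this sketch only drops text before the first test( call; trailing explanation would need additional handling.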
Generation Commands:
# Generate test code (default: claude)
npx playwright-generator generate --tc TC-0001
# Generate with Azure OpenAI
npx playwright-generator generate --tc TC-0001 --model azure-openai
# Generate with ChatGPT
npx playwright-generator generate --tc TC-0001 --model chatgpt
# Generate with local LLM
npx playwright-generator generate --tc TC-0001 --model local
# Generate to specific output file
npx playwright-generator generate --tc TC-0001 --output login.test.ts
# Generate multiple test cases
npx playwright-generator generate --tc TC-0001,TC-0002,TC-0003
Helper Generation
Helpers are reusable TypeScript classes that encapsulate common Playwright actions (e.g. login, navigation, form filling). They are defined in natural language in the helpers/ folder and generated into generated/helpers/ as TypeScript classes.
Helper Definition Format (helpers/LoginHelper.md):
[HELPER: LoginHelper]
# This is a helper class for login related actions
[HELPER-ACTION: login]
## This action is for logging in a user
- go to https://somewebsite.com
- URL https://somewebsite.com/login should be loaded
- input username as "user1"
- input password as "password"
- click the button with text "Log in"
- URL https://somewebsite.com/home should be loaded
[HELPER-ACTION: logout]
## This action is for logging out a user
- click the button with text "Logout"
- URL https://somewebsite.com/login should be loaded
Format Rules:
- First non-empty line: [HELPER: HelperName] (the name must start with a letter and contain only letters, numbers, and underscores)
- Second non-empty line: # Description (a description of the helper class)
- Each action starts with [HELPER-ACTION: actionName] (same naming rules as the helper name)
- Action description: ## Description on the next non-empty line
- Action details: bullet points describing the steps
- Each action is generated as a static async method on the helper class
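To make the format rules concrete, here is a minimal sketch of a parser that enforces them. The types HelperDefinition and HelperAction and the function parseHelper are hypothetical names invented for this example; they are not taken from the generator's source.

```typescript
// Hypothetical parser for the helper format above; not the generator's actual code.
interface HelperAction {
  name: string;
  description: string;
  steps: string[];
}
interface HelperDefinition {
  name: string;
  description: string;
  actions: HelperAction[];
}

// Names start with a letter and contain only letters, numbers, and underscores.
const NAME = "[A-Za-z][A-Za-z0-9_]*";

function parseHelper(source: string): HelperDefinition {
  const lines = source
    .split(/\r?\n/)
    .map((l) => l.trim())
    .filter((l) => l.length > 0);

  const header = new RegExp("^\\[HELPER:\\s*(" + NAME + ")\\]$").exec(lines[0] ?? "");
  if (!header) throw new Error("First non-empty line must be [HELPER: HelperName]");
  const descLine = lines[1] ?? "";
  if (!/^#\s+/.test(descLine)) throw new Error("Second non-empty line must be # Description");

  const helper: HelperDefinition = {
    name: header[1],
    description: descLine.replace(/^#\s+/, ""),
    actions: [],
  };

  const actionTag = new RegExp("^\\[HELPER-ACTION:\\s*(" + NAME + ")\\]$");
  let current: HelperAction | null = null;
  for (const line of lines.slice(2)) {
    const tag = actionTag.exec(line);
    if (tag) {
      current = { name: tag[1], description: "", steps: [] };
      helper.actions.push(current);
    } else if (current && line.indexOf("## ") === 0) {
      current.description = line.slice(3);
    } else if (current && line.indexOf("- ") === 0) {
      current.steps.push(line.slice(2));
    }
  }
  if (helper.actions.length === 0) throw new Error("No HELPER-ACTION sections found");
  return helper;
}

const loginHelper = parseHelper(
  [
    "[HELPER: LoginHelper]",
    "# This is a helper class for login related actions",
    "[HELPER-ACTION: login]",
    "## This action is for logging in a user",
    "- go to https://somewebsite.com",
    '- click the button with text "Log in"',
    "[HELPER-ACTION: logout]",
    "## This action is for logging out a user",
    '- click the button with text "Logout"',
  ].join("\n"),
);
console.log(loginHelper.name, loginHelper.actions.map((a) => a.name));
```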
Generation Commands:
# Generate a helper class (default: claude)
npx playwright-generator generate-helper LoginHelper
# Generate with a specific model
npx playwright-generator generate-helper LoginHelper --model azure-openai
npx playwright-generator generate-helper LoginHelper --model chatgpt
npx playwright-generator generate-helper LoginHelper --model local
Output: The generated helper class is written to generated/helpers/LoginHelper.ts:
import { Page } from "@playwright/test";

/**
 * This is a helper class for login related actions
 */
export class LoginHelper {
  static async login(page: Page): Promise<void> {
    // generated implementation
  }

  static async logout(page: Page): Promise<void> {
    // generated implementation
  }
}
Using helpers in test cases: Reference the generated helper in your natural language test case descriptions, and the LLM will use it when generating test code:
[TC-0001] [SMOKE] [LOGIN]
# User completes a purchase after logging in
- Use LoginHelper.login() to log in
- When the user adds an item to the cart
- Then the user completes checkout
Helper Methods with Parameters
The LLM provider supports generating helper methods that accept multiple parameters beyond the required Page object. Each helper method can have:
- First parameter (required): page: Page - The Playwright Page object
- Additional parameters: Specified in the helper definition with custom types
Adding Parameters to Helper Methods:
Use the [HELPER-PARAMS:] tag on the line following [HELPER-ACTION:] to specify parameters:
[HELPER: LoginHelper]
# Provides login helper methods
[HELPER-ACTION: loginWithCredentials]
[HELPER-PARAMS: username: string, password: string]
## Logs in to the application with given credentials
This method should:
- Navigate to the login page
- Enter the username in the username field
- Enter the password in the password field
- Click the login button
- Wait for the dashboard to appear
Details:
- Username field selector: #username
- Password field selector: #password
- Login button selector: button[type="submit"]
Generated Code with Parameters:
static async loginWithCredentials(page: Page, username: string, password: string): Promise<void> {
  // Implementation with all three parameters available
}
Usage in Tests:
import { test } from "@playwright/test";
import { LoginHelper } from "./helpers/LoginHelper";

test("TC-001 User login", async ({ page }) => {
  await LoginHelper.loginWithCredentials(
    page,
    "[email protected]",
    "password123",
  );
  // Rest of test...
});
Multiple Parameters:
Specify multiple parameters separated by commas:
[HELPER-PARAMS: param1: string, param2: number, param3: boolean, param4: string[]]
Supported Parameter Types:
The LLM respects TypeScript type annotations:
- Primitive types: string, number, boolean
- Arrays: string[], number[]
- Custom types: Any TypeScript type your project uses
- Union types: 'success' | 'error' | 'warning'
- Interfaces: LoginCredentials, UserData, etc.
Parameter Guidelines:
- Always use page: Page as the first parameter (automatically handled)
- Use descriptive parameter names that indicate their purpose
- Keep TypeScript types consistent with your codebase
- Include parameter descriptions in the "Action details" section if needed
- The LLM is instructed to use all provided parameters in the implementation
Backwards Compatibility:
Existing helper definitions without the [HELPER-PARAMS:] tag continue to work, generating methods with only the page: Page parameter.
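The mapping from a [HELPER-PARAMS:] tag to a method signature, including the backwards-compatible case with no tag, can be sketched as follows. The function buildSignature is a hypothetical name used only for this example, and in the real tool the signature is produced by the LLM rather than by string concatenation.

```typescript
// Hypothetical sketch: turns an optional [HELPER-PARAMS: ...] line into a method signature.
function buildSignature(action: string, paramsLine?: string): string {
  const params = ["page: Page"]; // page is always the first parameter

  const match = paramsLine
    ? /^\[HELPER-PARAMS:\s*(.*)\]$/.exec(paramsLine.trim())
    : null;
  if (match && match[1].trim().length > 0) {
    // Naive split on commas: fine for simple types, but it would break on
    // generic types that contain commas, e.g. Record<string, number>.
    for (const p of match[1].split(",")) {
      params.push(p.trim());
    }
  }
  return "static async " + action + "(" + params.join(", ") + "): Promise<void>";
}

console.log(
  buildSignature(
    "loginWithCredentials",
    "[HELPER-PARAMS: username: string, password: string]",
  ),
);
console.log(buildSignature("login")); // no tag: only the page parameter
```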
Test Results & Reporting
- Test execution produces detailed reports with pass/fail status
- HTML reports are generated in the reports/ folder
- Screenshots for failed test cases are automatically captured and stored in the audit/ folder
- Videos for failed test cases can be captured (disabled by default; enable with --video option)
- Test results can be exported in multiple formats (JSON, XML, HTML)
- Audit artifacts help with debugging and root cause analysis of test failures
Execution and Debugging
Various npm scripts are provided to run and debug Playwright test cases.
Since this is essentially a standard Playwright project, nothing prevents you from using the Playwright CLI commands you are already familiar with. The npm scripts below are provided as convenient shortcuts for common tasks, but you can always fall back to running npx playwright test directly with any flags you need.
Available Commands:
# Run all tests
npm run test
# Run tests by tag or test case ID
npm run test:case -- SMOKE
npm run test:case -- TC-0001
# Debug mode (opens inspector)
npm run test:debug -- SMOKE
# Run tests in headed mode with a specific tag
npm run test:headed -- SMOKE
# Run tests with specific browser (set BROWSER=chromium|firefox|webkit in .env)
# and with specific tag
npm run test:browser -- SMOKE
# Run tests with video recording enabled for failed cases (set VIDEO=on-failure in .env) and a specific tag
npm run test:video -- SMOKE
# Generate HTML report
npm run report
Advanced Features
Parallel Execution:
- Configure parallel test execution in playwright.config.ts
- Default: runs 4 tests in parallel
- Adjust with fullyParallel and workers settings
Cross-Browser Testing:
- Configure browsers in playwright.config.ts
- Default: Chromium
- Supported: Chromium, Firefox, WebKit
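As a rough illustration of the parallel and cross-browser settings above, the sketch below builds a Playwright-style configuration object from environment variables. The option names (fullyParallel, workers, retries, timeout, use.browserName, use.headless, use.baseURL) are standard Playwright configuration options; the function buildConfig and the way it reads the .env values are assumptions, and the playwright.config.ts created by init may be structured differently.

```typescript
// Hypothetical sketch of a config builder; the generated playwright.config.ts may differ.
type EnvVars = Record<string, string | undefined>;

function buildConfig(env: EnvVars) {
  return {
    fullyParallel: true, // also run tests inside a single file in parallel
    workers: 4, // the documented default of 4 parallel tests
    retries: Number(env.RETRIES ?? "1"),
    timeout: Number(env.TIMEOUT ?? "30000"), // milliseconds
    use: {
      browserName: env.BROWSER ?? "chromium", // chromium | firefox | webkit
      headless: env.HEADLESS !== "false",
      baseURL: env.BASE_URL ?? "http://localhost:3000",
    },
  };
}

console.log(buildConfig({ BROWSER: "firefox", HEADLESS: "false" }));
```

In a real project, a playwright.config.ts could default-export the result of a call such as buildConfig(process.env); lowering workers or setting fullyParallel to false trades speed for isolation.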
CI/CD Integration
GitHub Actions workflow files are created to run Playwright tests automatically:
- On merge to main: Runs full test suite
- Scheduled daily: Runs at midnight UTC every day
- On-demand: Manual trigger available in GitHub Actions UI
Workflow file location: .github/workflows/playwright-tests.yml
Best Practices
- Test Case Naming: Use clear, descriptive names that explain what is being tested
- Tags Strategy: Organize tests with tags (SMOKE, REGRESSION, SANITY, etc.)
- AI Prompting: Write natural language descriptions with:
  - Given-When-Then format for clarity
  - Specific selectors or UI elements mentioned
  - Expected outcomes clearly stated
- Code Review: Always review generated code before committing
- Assertions: Be explicit about what you're asserting
- Helper Classes: Use generate-helper to create reusable helper classes for common actions, reducing duplication across test cases
Troubleshooting
Common Issues:
| Issue | Solution |
| ----------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| API key not found | Verify .env file exists and CLAUDE_API_KEY or AZURE_OPENAI_API_KEY is set |
| Claude API connection failed | Check CLAUDE_API_KEY is valid and account has Claude API access; ensure key starts with sk-ant-. Visit https://console.anthropic.com/ to enable Claude API. Uses Claude 3 Haiku which is widely available. |
| Tests timing out | Increase TIMEOUT in .env or use explicit waits in test cases |
| Generated code doesn't compile | Review the natural language description for clarity; regenerate with a refined prompt |
| Tests pass locally but fail in CI | Check BASE_URL environment variable and add debugging with screenshots |
| Selector not found | Ensure selectors are unique and reference current UI state |
| Helper definition not found | Ensure the helper file exists in helpers/ and the [HELPER: Name] tag matches exactly |
| No HELPER-ACTION sections found | Add at least one [HELPER-ACTION: actionName] section to the helper definition file |
| ECONNREFUSED (local LLM) | Ollama server is not running — run ollama serve |
| model not found (local LLM) | Run ollama pull <model> first |
| Slow local LLM generation | Local models are slower than cloud APIs; expect 30–120s; try mistral for speed |
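Several rows in the table above come down to missing or mismatched .env values. A small pre-flight check along the following lines could report them before any generation is attempted. The function missingVariables is a hypothetical name, and treating the Azure endpoint and deployment as mandatory is an assumption based on the .env comments rather than confirmed behavior.

```typescript
// Hypothetical pre-flight check; variable names come from the .env section of this README.
const REQUIRED_BY_MODEL: Record<string, string[]> = {
  claude: ["CLAUDE_API_KEY"],
  // Assumption: endpoint and deployment are needed alongside the key.
  "azure-openai": [
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_DEPLOYMENT",
  ],
  chatgpt: ["CHATGPT_API_KEY"],
  local: ["LOCAL_LLM_URL", "LOCAL_LLM_MODEL"],
};

function missingVariables(env: Record<string, string | undefined>): string[] {
  const model = env.AI_MODEL ?? "claude"; // claude is the documented default
  const required = REQUIRED_BY_MODEL[model];
  if (!required) throw new Error("Unknown AI_MODEL: " + model);
  // A variable counts as missing when it is unset or only whitespace.
  return required.filter((name) => (env[name] ?? "").trim().length === 0);
}

console.log(
  missingVariables({ AI_MODEL: "local", LOCAL_LLM_URL: "http://localhost:11434" }),
);
```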
GitHub Integration
- GitHub Actions workflows automatically run Playwright tests when code is merged to the main branch
- Tests also run on a scheduled basis (daily at midnight UTC)
- Workflow status badges can be added to the README to display test status
Local LLM Setup (Ollama)
Run a local LLM with playwright-generator using Ollama, which exposes an OpenAI-compatible API on your machine — no API keys or internet access required.
How It Works
The local provider sends requests to LOCAL_LLM_URL/v1/chat/completions, the standard endpoint exposed by Ollama. Any other local server that implements the same OpenAI-compatible API (e.g. LM Studio, llama.cpp server) will also work.
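For reference, a request to that endpoint might be assembled as in the sketch below. The body follows the widely used OpenAI chat completions shape (model plus a messages array); the function buildChatRequest and the single user message are assumptions for illustration, and the prompts the generator actually sends are not documented here.

```typescript
// Hypothetical sketch of the request the local provider might send.
interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildChatRequest(baseUrl: string, model: string, prompt: string): ChatRequest {
  return {
    // Tolerate a trailing slash in LOCAL_LLM_URL.
    url: baseUrl.replace(/\/+$/, "") + "/v1/chat/completions",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model, // must match LOCAL_LLM_MODEL, e.g. "llama3"
        messages: [{ role: "user", content: prompt }],
        stream: false, // ask for a single JSON response instead of a stream
      }),
    },
  };
}

const request = buildChatRequest(
  "http://localhost:11434",
  "llama3",
  "Generate a Playwright test for TC-0001",
);
console.log(request.url);
```

Passing request.url and request.init to fetch sends the call; in OpenAI-compatible responses, the generated text is typically found at choices[0].message.content.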
1. Install Ollama
macOS / Linux:
curl -fsSL https://ollama.com/install.sh | sh
macOS (Homebrew):
brew install ollama
Windows: Download the installer from https://ollama.com/download
2. Start the Ollama Server
ollama serve
By default, Ollama listens on http://localhost:11434. The OpenAI-compatible endpoint is available at http://localhost:11434/v1/chat/completions.
3. Pull a Model
# General purpose — good balance of speed and quality
ollama pull llama3
# Smaller / faster — good for low-resource machines
ollama pull mistral
# Code-focused — best results for test generation
ollama pull codellama
To list all models you have pulled:
ollama list
4. Configure .env
AI_MODEL=local
LOCAL_LLM_URL=http://localhost:11434 # Ollama default; change if using a different port
LOCAL_LLM_MODEL=llama3 # Must match the model name you pulled
5. Generate a Test Case
npx playwright-generator generate --tc TC-0001 --model local
Or set AI_MODEL=local in .env and omit the --model flag:
npx playwright-generator generate --tc TC-0001
Recommended Models
| Model | Size | Best For |
| ------------------- | ------- | ----------------------------------------- |
| llama3 | ~4.7 GB | General use, good code quality |
| codellama | ~3.8 GB | Code generation tasks |
| mistral | ~4.1 GB | Fast, low memory usage |
| llama3:8b | ~8 GB | Higher quality output |
| deepseek-coder | ~3.8 GB | Code-focused, strong TypeScript |
| deepseek-coder-v2 | ~8.9 GB | Stronger code quality than v1 |
| deepseek-r1 | ~4.7 GB | Reasoning-focused, good for complex logic |
| qwen2.5-coder | ~4.7 GB | Strong TypeScript/code generation |
| qwen2.5 | ~4.7 GB | General use, multilingual |
Qwen Models (by Alibaba)
Qwen models perform well for code generation tasks and are fully supported by Ollama.
ollama pull qwen2.5-coder # Recommended for code generation
ollama pull qwen2.5-coder:7b # Larger variant for higher quality
ollama pull qwen2.5 # General purpose
AI_MODEL=local
LOCAL_LLM_URL=http://localhost:11434
LOCAL_LLM_MODEL=qwen2.5-coder
Tip: qwen2.5-coder is specifically trained on code and tends to produce cleaner TypeScript than general-purpose models of the same size.
DeepSeek Models (by DeepSeek AI)
DeepSeek offers both code-focused and reasoning models, both fully supported by Ollama.
ollama pull deepseek-coder-v2 # Code-focused (recommended for test generation)
ollama pull deepseek-r1 # Reasoning model — good for complex multi-step test logic
ollama pull deepseek-coder # Smaller/faster code model
AI_MODEL=local
LOCAL_LLM_URL=http://localhost:11434
LOCAL_LLM_MODEL=deepseek-coder-v2
Tip: deepseek-r1 uses chain-of-thought reasoning internally, which can improve accuracy for complex test cases but is slower than deepseek-coder-v2.
Using LM Studio (Alternative)
LM Studio also exposes an OpenAI-compatible server:
- Download and install LM Studio from https://lmstudio.ai
- Download a model inside the app (e.g. Meta-Llama-3-8B-Instruct)
- Start the local server from the Local Server tab (default port: 1234)
- Set in .env:
AI_MODEL=local
LOCAL_LLM_URL=http://localhost:1234
LOCAL_LLM_MODEL=Meta-Llama-3-8B-Instruct # Must match the model name shown in LM Studio
Note: Local models are slower than cloud APIs; expect 30–120 seconds per generation depending on hardware. No data leaves your machine.
VS Code Extension
A VS Code extension is available to provide a graphical interface for playwright-generator — configure your AI model, generate test cases and helper classes, and run your tests without leaving the editor.
Install from the Marketplace:
Playwright Generator — VS Code Extension
Or search for Playwright Generator in the VS Code Extensions panel (Ctrl+Shift+X).
Features
- Config tab: configure AI model credentials and Playwright settings; changes are auto-saved to .env
- Generate tab: browse and search test case IDs from tests/; generate TypeScript test code with one click
- Helpers tab: browse helper definitions from helpers/; see which actions have been generated and generate missing ones
- Run tab: run all tests, run by tag, run with UI, debug, and view the last HTML report
Requirements
- A workspace initialised with npx playwright-generator init
- A .env file in the workspace root
Getting Started
- Install the extension from the Marketplace
- Open your playwright-generator project in VS Code
- Click the Playwright Generator icon in the Activity Bar
- Configure your AI model in the Config tab
- Select a test case and click Generate in the Generate tab
- Run your tests from the Run tab
Contributing
Contributions are welcome! Please follow these guidelines:
- Fork the repository
- Create a feature branch: git checkout -b feature/my-feature
- Make your changes and write tests
- Commit with clear messages: git commit -m "Add feature: description"
- Push to your fork and submit a Pull Request
Development Setup
# Clone the repository
git clone https://github.com/yourusername/playwright-generator.git
cd playwright-generator
# Install dependencies
npm install
# Build the project
npm run build
# Run tests
npm test
License
MIT License - See LICENSE file for details
Support
For issues, questions, or suggestions, please open an issue on GitHub or contact the maintainers.
