deepeval

v0.1.31

Published

24 days ago

The LLM Evaluation Framework for TypeScript

penguine-ip

kritinv

DeepEval.ts

DeepEval.ts is the TypeScript client for Confident AI's DeepEval API, a framework for evaluating and testing large language models (LLMs).

Installation

npm install deepeval

Authentication

DeepEval.ts requires a Confident AI API key to authenticate with the service. You can set up your API key in one of the following ways:

Option 1: Environment Variables

Set the CONFIDENT_API_KEY environment variable:

# In your terminal
export CONFIDENT_API_KEY="your-api-key-here"

# Or on Windows (Command Prompt)
set CONFIDENT_API_KEY=your-api-key-here

Option 2: .env File

Create a .env file in your project root:

# .env file
CONFIDENT_API_KEY="your-api-key-here"

Then use a package like dotenv to load it:

npm install dotenv

// At the top of your entry file
import 'dotenv/config';
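
With either environment-based option, it can help to fail fast when the key is missing instead of hitting an authentication error later. The following is a minimal sketch, not part of DeepEval.ts: requireApiKey is a hypothetical helper, and it takes the environment as a parameter so it can be exercised without a real process environment.

```typescript
// Hypothetical helper (not part of deepeval): fail fast when the key is missing.
function requireApiKey(env: Record<string, string | undefined>): string {
  // Treat an empty or whitespace-only value the same as an unset variable.
  const key = env["CONFIDENT_API_KEY"]?.trim();
  if (!key) {
    throw new Error(
      "CONFIDENT_API_KEY is not set. Export it in your shell or add it to .env."
    );
  }
  return key;
}

// Usage in a Node.js entry file, after `import 'dotenv/config'`:
//   const apiKey = requireApiKey(process.env);
```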

Option 3: Pass API Key Directly

You can also pass your API key directly when creating an API instance:

import { Api } from 'deepeval/confident';

const api = new Api("your-api-key-here");

Global API Keys (Multi-Project Access)

A global API key authenticates across every project in your organization, so you don't need a separate key per project. Set it as your CONFIDENT_API_KEY (its value starts with confident_<region>_global_), then specify which project each call targets by passing projectId:

import { Prompt, EvaluationDataset } from 'deepeval';

// Pull a prompt from a specific project
const prompt = new Prompt({ alias: 'my-prompt' });
await prompt.pull({ projectId: 'proj_123' });

// Pull a dataset from a specific project
const dataset = new EvaluationDataset();
await dataset.pull({ alias: 'my-dataset', projectId: 'proj_123' });

projectId is accepted by every data method (pull, push, evaluate, and so on). With a regular project-scoped key, omit projectId and each call targets that key's project automatically.
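
The rule above can be sketched as a small guard that decides whether a call needs an explicit projectId. This is illustrative only: isGlobalKey and resolveProjectId are hypothetical helpers, not DeepEval.ts functions. The pattern assumes the <region> segment is a single token without underscores, and the guard assumes a global key always needs a projectId.

```typescript
// Hypothetical helpers (not part of deepeval).
// Assumption: the <region> segment contains no underscores, e.g. "us" or "eu".
const GLOBAL_KEY_PATTERN = /^confident_[^_]+_global_/;

function isGlobalKey(apiKey: string): boolean {
  return GLOBAL_KEY_PATTERN.test(apiKey);
}

// Returns the projectId to pass to a data method, or undefined to omit it.
function resolveProjectId(apiKey: string, projectId?: string): string | undefined {
  if (isGlobalKey(apiKey)) {
    // Assumption: a global key has no default project, so one must be named.
    if (!projectId) {
      throw new Error("A global API key requires a projectId for each call.");
    }
    return projectId;
  }
  // Project-scoped key: omit projectId so the call targets the key's own project.
  return undefined;
}
```

A wrapper around pull or push could call resolveProjectId once and spread the result into the options object, so the same code path works with both kinds of key.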

Usage Examples

Working with Datasets

import { EvaluationDataset, LLMTestCase } from 'deepeval';

// Load dataset from CSV
const dataset = new EvaluationDataset();
const csvTestCases = await dataset.addTestCasesFromCSV({
  filePath: 'path/to/dataset.csv',
  inputCol: 'input_column',
  actualOutputCol: 'actual_output_column',
  expectedOutputCol: 'expected_output_column'
});
csvTestCases.forEach((testCase) => dataset.addTestCase(testCase));

// Create dataset programmatically
const customDataset = new EvaluationDataset();
customDataset.addTestCase(
  new LLMTestCase({
    input: "What is the capital of France?",
    actualOutput: "Paris is the capital of France.",
    expectedOutput: "Paris"
  })
);

// Iterate through test cases
for (const testCase of dataset.testCases) {
  console.log(`Input: ${testCase.input}`);
  console.log(`Output: ${testCase.actualOutput}`);
}
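
For reference, a CSV file matching the column names used above might be generated as follows. This is a sketch under assumptions: it presumes the loader expects a header row and standard RFC 4180 style quoting, which the README does not state. toCsv and escapeCsvField are hypothetical helpers, not part of DeepEval.ts.

```typescript
type Row = Record<string, string>;

// Quote a field when it contains a comma, double quote, or line break.
function escapeCsvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Build CSV text with a header row followed by one line per row object.
function toCsv(columns: string[], rows: Row[]): string {
  const header = columns.map(escapeCsvField).join(",");
  const body = rows.map((row) =>
    columns.map((col) => escapeCsvField(row[col] ?? "")).join(",")
  );
  return [header, ...body].join("\n");
}

const csv = toCsv(
  ["input_column", "actual_output_column", "expected_output_column"],
  [
    {
      input_column: "What is the capital of France?",
      actual_output_column: "Paris is the capital of France.",
      expected_output_column: "Paris",
    },
  ]
);
// csv now holds two lines: the header, then one data row.
```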

API Reference

EvaluationDataset

The EvaluationDataset class manages collections of test cases for LLM evaluation.

import { EvaluationDataset, LLMTestCase } from 'deepeval';

// Create a new dataset
const dataset = new EvaluationDataset();

// Add test cases from CSV
const testCases = await dataset.addTestCasesFromCSV({
  filePath,              // Path to CSV file
  inputCol,              // Name of input column
  actualOutputCol,       // Name of actual output column
  expectedOutputCol,     // Name of expected output column (optional)
  contextCol,            // Name of context column (optional)
  contextDelimiter,      // Delimiter for context values (optional)
  retrievalContextCol,   // Name of retrieval context column (optional)
  retrievalContextDelimiter // Delimiter for retrieval context values (optional)
});

testCases.forEach((testCase) => dataset.addTestCase(testCase));

// Add a test case programmatically
dataset.addTestCase(
  new LLMTestCase({
    input: "What is the capital of France?",
    actualOutput: "Paris is the capital of France.",
    expectedOutput: "Paris",
    context: ["France is a country in Europe.", "Paris is a city."],
    retrievalContext: ["Paris is the capital and most populous city of France."]
  })
);
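
To illustrate what contextDelimiter and retrievalContextDelimiter are for: a single CSV cell can hold several context strings, which are split into the string array that context and retrievalContext expect. The sketch below shows the idea only. splitDelimited is a hypothetical helper, the ";" default is an assumption rather than the library's documented default, and trimming or dropping empty parts may not match what DeepEval.ts does internally.

```typescript
// Hypothetical helper (not part of deepeval).
// Assumption: ";" as the default delimiter is illustrative only.
function splitDelimited(cell: string, delimiter: string = ";"): string[] {
  return cell
    .split(delimiter)
    .map((part) => part.trim())          // drop whitespace around each value
    .filter((part) => part.length > 0);  // ignore empty segments
}

// One CSV cell holding two context strings.
const context = splitDelimited("France is a country in Europe.; Paris is a city.");
```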

LLMTestCase

The LLMTestCase class represents individual test cases for LLM evaluation.

import { LLMTestCase, ToolCall } from 'deepeval';

const testCase = new LLMTestCase({
  input: "What is the capital of France?",
  actualOutput: "Paris is the capital of France.",
  expectedOutput: "Paris",
  context: ["France is a country in Europe.", "Paris is a city."],
  retrievalContext: ["Paris is the capital and most populous city of France."],
  toolsCalled: [
    new ToolCall({
      name: "search",
      inputParameters: { query: "capital of France" },
      output: { result: "Paris is the capital of France" }
    })
  ]
});
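
As a reading aid, the fields used in the example above can be written out as plain interfaces. These shapes are inferred from the example and from the CSV options, where only the expected output, context, and retrieval context columns are marked optional. They are not the library's actual type declarations, and calledToolNames is a hypothetical helper.

```typescript
// Illustrative shapes only; the library's real type declarations may differ.
interface ToolCallFields {
  name: string;
  inputParameters?: Record<string, unknown>;
  output?: unknown;
}

interface TestCaseFields {
  input: string;
  actualOutput: string;
  expectedOutput?: string;
  context?: string[];
  retrievalContext?: string[];
  toolsCalled?: ToolCallFields[];
}

// List the tool names recorded on a test case, in call order.
function calledToolNames(testCase: TestCaseFields): string[] {
  return (testCase.toolsCalled ?? []).map((call) => call.name);
}

const fields: TestCaseFields = {
  input: "What is the capital of France?",
  actualOutput: "Paris is the capital of France.",
  toolsCalled: [{ name: "search", inputParameters: { query: "capital of France" } }],
};
const names = calledToolNames(fields);
```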

Development

To build the package locally:

npm run build

To run tests:

npm test

License

MIT