jest-ai

v2.0.0

Custom jest matchers for testing AI applications

Downloads

451

Readme


The problem

Developing AI tools and applications requires a lot of manual testing and prompt tweaking. On top of that, for many developers the world of AI still feels like "uncharted land".

This solution

The jest-ai library provides a set of custom jest matchers that you can use to extend jest. They let you test the calls and responses of LLMs in a more familiar way.

Installation

This module is distributed via npm, which is bundled with node, and should be installed as one of your project's devDependencies:

npm install --save-dev jest-ai

or, for installation with the yarn package manager:

yarn add --dev jest-ai

Usage

First things first: make sure OPENAI_API_KEY is set in your environment variables, as this library uses the OpenAI API to run the tests.
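A missing key tends to surface as a confusing API error in the middle of a test run. As a minimal sketch, a guard like the one below could sit at the top of your setup file. The function name `requireOpenAiKey` is hypothetical and not part of jest-ai.

```javascript
// Hypothetical helper, not part of jest-ai.
// Fails fast with a readable message when the key is missing or blank.
// `env` defaults to process.env; it is a parameter only so the check is easy to test.
function requireOpenAiKey(env = process.env) {
  const key = env.OPENAI_API_KEY;
  if (typeof key !== "string" || key.trim() === "") {
    throw new Error(
      "OPENAI_API_KEY is not set; jest-ai matchers call the OpenAI API"
    );
  }
  return key;
}

// In jest-setup.js you would call requireOpenAiKey() once, before any tests run.
```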

Import jest-ai once (for instance in your tests setup file) and you're good to go:

// In your own jest-setup.js (or any other name)
import "jest-ai";

// In jest.config.js add (if you haven't already)
module.exports = {
  setupFilesAfterEnv: ["<rootDir>/jest-setup.js"],
};

With @jest/globals

If you are using @jest/globals with injectGlobals: false, you will need to use a different import in your tests setup file:

// In your own jest-setup.js (or any other name)
import "jest-ai/jest-globals";

With TypeScript

If you're using TypeScript, make sure your setup file is a .ts and not a .js to include the necessary types.

You will also need to include your setup file in your tsconfig.json if you haven't already:

  // In tsconfig.json
  "include": [
    ...
    "./jest-setup.ts"
  ],

If TypeScript is not able to resolve the matcher methods, you can add the following to your tsconfig.json:

{
  "compilerOptions": {
    "types": ["jest", "jest-ai"]
  }
}

Custom matchers

toSemanticallyMatch

toSemanticallyMatch();

This allows checking if the response from the AI matches or includes the expected response. It uses semantic comparison, which means that "What is your age?" and "When were you born?" could both pass. This accommodates the natural variation in how an AI phrases its responses.

Examples

const response = await ai.getResponse("Hello");
// AI Response: "Hello, I am a chatbot set to help you with information for your flight. Can you please share your flight number with me?"
await expect(response).toSemanticallyMatch("What is your flight number?");

or

await expect("What is your surname?").toSemanticallyMatch(
  "What is your last name?"
);

:warning: This matcher is async: use async await when calling the matcher. This library uses a cosine similarity calculation to measure how close the two strings are, so a range of phrasings can pass or fail. Currently, the threshold is set to 0.75.
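To make the threshold concrete, here is a minimal sketch of a cosine similarity check. It assumes each string has already been turned into a numeric embedding vector, and that a score at or above the threshold counts as a match. Both are assumptions; the README does not show how the library does this. The function names are illustrative only.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// 0.75 is the threshold quoted above; using >= rather than > is an assumption.
const SIMILARITY_THRESHOLD = 0.75;

function isSemanticMatch(embeddingA, embeddingB, threshold = SIMILARITY_THRESHOLD) {
  return cosineSimilarity(embeddingA, embeddingB) >= threshold;
}

// Toy 3-dimensional vectors standing in for real embeddings.
console.log(isSemanticMatch([1, 0, 0], [1, 0, 0])); // true (identical direction)
console.log(isSemanticMatch([1, 0, 0], [0, 1, 0])); // false (orthogonal)
```

Because the score is continuous, two phrasings that sit near 0.75 can flip between passing and failing, which is why borderline expectations are worth rewording.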

toSatisfyStatement

toSatisfyStatement();

This checks if the response from the AI satisfies a simple true or false statement. It uses a custom prompt and a separate chat completion to determine the truthiness of the statement. If the truthiness of the statement cannot be determined from the response, the assertion will fail.

Examples

const response = await ai.getResponse("Hello");
// AI Response: "Hello, I am a chatbot set to help you with information for your flight. Can you please share your flight number with me?"
await expect(response).toSatisfyStatement(
  "It contains a question asking for your flight number."
);

or

await expect("What is your surname?").toSatisfyStatement(
  "It asks for your last name."
);

:warning: This matcher is async: use async await when calling the matcher. This assertion uses the OpenAI chat completion API, using the gpt-4-turbo model by default. As always, be aware of your API usage!
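The "judge" step can be pictured roughly as follows. This is a sketch under stated assumptions, not the library's implementation: the prompt wording, the three-word verdict format, and the names `buildJudgePrompt` and `interpretVerdict` are all hypothetical. The real prompt is not shown in this README.

```javascript
// Hypothetical prompt builder; jest-ai's actual prompt may differ entirely.
function buildJudgePrompt(response, statement) {
  return [
    "Answer with exactly one word: true, false, or undetermined.",
    `Text: ${JSON.stringify(response)}`,
    `Statement about the text: ${JSON.stringify(statement)}`,
  ].join("\n");
}

// Maps the judge model's reply to a pass/fail result.
// Anything other than a clear "true" fails, including "undetermined",
// which mirrors the rule that an undeterminable statement fails the assertion.
function interpretVerdict(reply) {
  const verdict = reply.trim().toLowerCase().replace(/[.!]+$/, "");
  return { pass: verdict === "true", verdict };
}

console.log(interpretVerdict("True.").pass); // true
console.log(interpretVerdict("undetermined").pass); // false
```

Every call to the matcher costs one extra chat completion, so it pays to keep statements short and unambiguous.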

toHaveUsedSomeTools

toHaveUsedSomeTools();

Assert that a Chat Completion response requests the use of a particular tool.

Examples

const getResponse = async () =>
  await ai.getResponse("Will my KL1234 flight be delayed?");
await expect(getResponse).toHaveUsedSomeTools(["get_flight_status"]);
await expect(getResponse).toHaveUsedSomeTools([
  { name: "get_flight_status", arguments: "KL1234" },
]);

:warning: This matcher is async: use async await when calling the matcher. This matcher uses the OpenAI chat completion API to check tool calls.

toHaveUsedSomeAssistantTools

toHaveUsedSomeAssistantTools();

Assert that an Assistants API Run response requests the use of a particular tool.

Examples

const assistant = await openai.beta.assistants.create({
  name: "Weather Reporter",
  instructions: "You are a reporter who answers questions on the weather.",
  tools: [getWeatherTool],
  model: "gpt-3.5-turbo-0125",
});

const thread = await openai.beta.threads.create();
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "What is the weather in New York City?",
});

let run = await openai.beta.threads.runs.create(thread.id, {
  assistant_id: assistant.id,
});

// Assert on just function name
await expect(run).toHaveUsedSomeAssistantTools(["getWeather"]);

// Assert on function name and arguments
await expect(run).toHaveUsedSomeAssistantTools([
  { name: "getWeather", arguments: "New York City" },
]);

:warning: This matcher is async: use async await when calling the matcher. This matcher polls the OpenAI Run API to check for tool calls.
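Polling here means retrieving the run repeatedly until it stops progressing on its own. The sketch below shows that loop. The run lookup is injected as `retrieveRun` so the example runs without network access; in real code it would wrap the SDK's `openai.beta.threads.runs.retrieve` call. The function name, the interval, and the attempt limit are assumptions, not jest-ai internals.

```javascript
// Statuses after which a run will not progress without outside action.
const SETTLED_STATUSES = new Set([
  "requires_action",
  "completed",
  "failed",
  "cancelled",
  "expired",
  "incomplete",
]);

// Hypothetical helper: polls until the run settles, then returns its tool calls.
async function pollForToolCalls(
  retrieveRun,
  { intervalMs = 1000, maxAttempts = 30 } = {}
) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const run = await retrieveRun();
    if (SETTLED_STATUSES.has(run.status)) {
      // Tool calls are only present while the run waits for tool outputs.
      return run.required_action?.submit_tool_outputs?.tool_calls ?? [];
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Run did not settle after ${maxAttempts} attempts`);
}

// Usage with a fake run that needs two polls before it requests a tool.
const fakeStatuses = ["queued", "in_progress"];
const fakeRetrieve = async () => {
  const status = fakeStatuses.shift();
  if (status) return { status };
  return {
    status: "requires_action",
    required_action: {
      type: "submit_tool_outputs",
      submit_tool_outputs: {
        tool_calls: [
          {
            id: "call_1",
            type: "function",
            function: { name: "getWeather", arguments: '{"city":"New York City"}' },
          },
        ],
      },
    },
  };
};

pollForToolCalls(fakeRetrieve, { intervalMs: 10 }).then((calls) =>
  console.log(calls.map((call) => call.function.name))
);
```

Since each poll is a network round trip, tests built on this matcher are slower than the chat completion ones and benefit from a generous jest timeout.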

toHaveUsedAllTools

toHaveUsedAllTools();

Checks if all the tools given to the LLM were used. Will fail if any of the tools were not used.

Examples

const getResponse = async () =>
  await ai.getResponse("Will my KL1234 flight be delayed?");
await expect(getResponse).toHaveUsedAllTools([
  "get_flight_status",
  "get_flight_delay",
]);
await expect(getResponse).toHaveUsedAllTools([
  { name: "get_flight_status", arguments: "KL1234" },
  { name: "get_flight_delay", arguments: "KL1234" },
]);

:warning: This matcher is async: use async await when calling the matcher. This matcher uses the OpenAI chat completion API to check tool calls.
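The difference between toHaveUsedSomeTools and toHaveUsedAllTools comes down to "any" versus "every" over the expected list. The sketch below illustrates that against the `tool_calls` array of a Chat Completion response. The helper names are hypothetical, and matching `arguments` as a substring of the call's JSON arguments string is an assumption; the README does not say how the library compares arguments.

```javascript
// Collects the function tool calls from a Chat Completion response object.
function getToolCalls(completion) {
  const message = completion.choices?.[0]?.message;
  return (message?.tool_calls ?? []).filter((call) => call.type === "function");
}

// Assumption: a string expectation matches on name only; an object expectation
// also requires its `arguments` text to appear in the call's JSON arguments string.
function matchesExpectation(call, expected) {
  const { name, arguments: args } =
    typeof expected === "string" ? { name: expected } : expected;
  if (call.function.name !== name) return false;
  return args === undefined || call.function.arguments.includes(args);
}

// "Some": at least one expected tool was requested.
function usedSomeTools(completion, expectedTools) {
  const calls = getToolCalls(completion);
  return expectedTools.some((e) => calls.some((c) => matchesExpectation(c, e)));
}

// "All": every expected tool was requested.
function usedAllTools(completion, expectedTools) {
  const calls = getToolCalls(completion);
  return expectedTools.every((e) => calls.some((c) => matchesExpectation(c, e)));
}

// A trimmed-down response in which the model requested one tool.
const sampleCompletion = {
  choices: [
    {
      message: {
        role: "assistant",
        content: null,
        tool_calls: [
          {
            id: "call_1",
            type: "function",
            function: { name: "get_flight_status", arguments: '{"flight":"KL1234"}' },
          },
        ],
      },
    },
  ],
};

const expectedTools = ["get_flight_status", "get_flight_delay"];
console.log(usedSomeTools(sampleCompletion, expectedTools)); // true
console.log(usedAllTools(sampleCompletion, expectedTools)); // false
```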

toHaveUsedAllAssistantTools

toHaveUsedAllAssistantTools();

Assert that an Assistants API Run response requests the use of all of the given tools. The assertion fails if any of them was not requested.

Examples

const assistant = await openai.beta.assistants.create({
  name: "Weather Reporter",
  instructions: "You are a reporter who answers questions on the weather.",
  tools: [getWeatherTool],
  model: "gpt-3.5-turbo-0125",
});

const thread = await openai.beta.threads.create();
await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "What is the weather in New York City and in San Francisco?",
});

let run = await openai.beta.threads.runs.create(thread.id, {
  assistant_id: assistant.id,
});
// Assert simply on function name
await expect(run).toHaveUsedAllAssistantTools(["getWeather"]);

// Assert on function name and arguments
await expect(run).toHaveUsedAllAssistantTools([
  { name: "getWeather", arguments: "New York City" },
  { name: "getWeather", arguments: "San Francisco" },
]);

:warning: This matcher is async: use async await when calling the matcher. This matcher polls the OpenAI Run API to check for tool calls.

toMatchZodSchema

toMatchZodSchema();

Many times, we would like our LLMs to respond in a JSON format that's easier to work with later. This matcher allows us to check if the response from the LLM matches a given Zod schema.

Examples

import { z } from "zod";

const response = await ai.getResponse(`
    Name 3 animals, their height, and weight. Response in the following JSON format:
    {
        "animals": [
            {
                "name": "Elephant",
                "height": "3m",
                "weight": "6000kg"
            }
        ]
    }
`);
const expectedSchema = z.object({
  animals: z.array(
    z.object({
      name: z.string(),
      height: z.string(),
      weight: z.string(),
    })
  ),
});
expect(response).toMatchZodSchema(expectedSchema);

LICENSE

MIT