
@leighton-digital/llm-test-tools v0.1.0-beta.1

LLM Test Tools


A TypeScript library for testing AI responses, using the Amazon Bedrock Converse API under the hood. This package helps you validate AI-generated responses against specific assertions and analyse their tone, factual content, and confidence level.

Features

  • Test AI-generated responses against your own assertions using the Amazon Bedrock Converse API
  • Built-in Jest custom matchers for easy assertions
  • Support for any Amazon Bedrock model (Claude, Titan, etc.)
  • Configurable parameters for model invocation (max tokens, temperature, etc.)
  • Tone analysis (neutral, happy, sad, angry)
  • Confidence scoring system (0-10 scale)
  • Assertions on factual content
  • TypeScript support with full type safety

Installation

npm install @leighton-digital/llm-test-tools
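Since the library is only needed at test time, you will likely want to install it as a dev dependency instead:

```shell
npm install --save-dev @leighton-digital/llm-test-tools
```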

Usage

Importing the Package

import { ResponseAssertions, AssertionsMet, Tone } from '@leighton-digital/llm-test-tools';

Basic Example

// Create a tester instance with optional AWS region configuration
const tester = new ResponseAssertions({ region: 'eu-west-1' });
// or, to use the default region ('us-east-1'):
// const tester = new ResponseAssertions();

// Test an AI response (actual example text to assert on and the prompt assertions)
const assertionResponse = await tester.responseAssertions({
  text: 'Your AI generated text to validate',
  prompt: 'Your prompt text with your assertions',
});

// OR, test an AI response with optional parameters
const assertionResponseWithOptions = await tester.responseAssertions({
  text: 'Your AI generated text to validate',
  prompt: 'Your prompt text with your assertions',
  modelId: 'anthropic.claude-3-sonnet-20240229-v1:0', // Optional model override
  temperature: 0.7, // Optional
  maxTokensToSample: 500, // Optional
  topP: 0.9, // Optional
});

// Using custom Jest matcher test the response
expect(assertionResponse).toSatisfyAssertions({
  assertionsMet: AssertionsMet.yes,
  tone: Tone.neutral,
  score: 8, // minimum score
});

// OR, using standard Jest matchers
expect(assertionResponse.assertionsMet).toEqual(AssertionsMet.yes);
expect(assertionResponse.tone).toEqual(Tone.neutral);
expect(assertionResponse.score).toBeGreaterThanOrEqual(8);
expect(assertionResponse.isFactual).toEqual(true);

Jest Matchers

The package includes custom Jest matchers:

// Using the custom matcher
expect(assertionResponse).toSatisfyAssertions({
  assertionsMet: AssertionsMet.yes,
  tone: Tone.neutral,
  score: 8,
  isFactual: true,
});

// Or using standard Jest matchers
expect(assertionResponse.assertionsMet).toBe(AssertionsMet.yes);
expect(assertionResponse.tone).toBe(Tone.neutral);
expect(assertionResponse.score).toBeGreaterThanOrEqual(8);
expect(assertionResponse.isFactual).toEqual(true);

If the toSatisfyAssertions matcher fails, the failure message includes an explanation of why, for example:

● test › should assert that the text meets the correct assertions

Expected {"assertionsMet": false, "explanation": "The text describes a cat that is black and named Mittens. The assertion to check is that the cat is brown, which is not mentioned in the text. Therefore, the assertions are not met. The text is factually correct in describing the cat's color and name, but it does not address the assertion about the cat being brown. Hence, the score is low due to the lack of matching assertions.", "isFactual": true, "score": 2, "tone": "neutral"} to satisfy {"assertionsMet": true, "isFactual": true, "score": 8, "tone": "neutral"}

Available Assertion Values

The package provides the following built-in values to assert against:

// AssertionsMet (a const object, not a TypeScript enum)
export const AssertionsMet = {
  yes: true,
  no: false,
};

// Tone enum
export enum Tone {
  neutral = 'neutral',
  happy = 'happy',
  sad = 'sad',
  angry = 'angry',
}

Configuration Options

You can configure the following parameters when using responseAssertions:

interface ResponseAssertionsInput {
  prompt: string; // The assertions to test against
  text: string; // The AI response text to test
  modelId?: string; // AWS Bedrock model ID (default: 'us.amazon.nova-premier-v1:0')
  maxTokensToSample?: number; // Maximum tokens to generate (default: 500)
  temperature?: number; // Temperature for response generation (default: 0.3)
  topP?: number; // Top-p sampling value (default: 0.7)
}
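To make the defaults above concrete, here is a minimal sketch. `withDefaults` is a hypothetical helper that mimics how the documented defaults would be merged with your input; it is for illustration only and is not the package's actual implementation:

```typescript
interface ResponseAssertionsInput {
  prompt: string;
  text: string;
  modelId?: string;
  maxTokensToSample?: number;
  temperature?: number;
  topP?: number;
}

// Hypothetical helper: apply the documented defaults for any
// option the caller omits.
function withDefaults(input: ResponseAssertionsInput): Required<ResponseAssertionsInput> {
  return {
    modelId: 'us.amazon.nova-premier-v1:0',
    maxTokensToSample: 500,
    temperature: 0.3,
    topP: 0.7,
    ...input,
  };
}

const resolved = withDefaults({
  prompt: 'The response should mention Paris',
  text: 'Paris is the capital of France.',
});
console.log(resolved.modelId);     // the default model
console.log(resolved.temperature); // the default temperature
```

Note that omitted keys fall back to the defaults, while any option you pass explicitly takes precedence.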

Running Tests

To run the tests:

npm run test

or, in watch mode:

npm run test:watch

You can also run the tests with an AWS profile and AWS account ID as shown below:

AWS_PROFILE=my-profile AWS_ACCOUNT_ID=123456789123 AWS_REGION=us-east-1 npm run test

Example Assertions

Here are some example assertion formats:

// Perfect response assertions
"The response should include:\n- The exact height of the Eiffel Tower\n- The year it was built\n- The material it's made of\n- Number of levels open to the public";

// Tone-based assertions
'The response should be positive and enthusiastic about the Eiffel Tower';

// Partial assertions
"The response should at least mention:\n- The Eiffel Tower's location\n- Some interesting facts about it";
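Assertion prompts are plain strings, so they can also be built programmatically. `buildAssertionPrompt` below is a hypothetical helper (not part of the package) that produces the bulleted format used in the examples above:

```typescript
// Hypothetical helper: turn an intro line and a list of expectations
// into the newline-and-dash format used in the example assertions.
function buildAssertionPrompt(intro: string, expectations: string[]): string {
  return `${intro}\n${expectations.map((e) => `- ${e}`).join('\n')}`;
}

const prompt = buildAssertionPrompt('The response should include:', [
  'The exact height of the Eiffel Tower',
  'The year it was built',
]);
console.log(prompt);
```

The resulting string can then be passed as the `prompt` option to `responseAssertions`.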

Troubleshooting

  1. If you don't have access to the model in the specified account, you will get the following error:
AccessDeniedException: You don't have access to the model with the specified model ID.

To fix this, request access on the model access page in the Amazon Bedrock console.

Score Interpretation

The confidence score ranges from 0-10:

  • 10: All assertions perfectly match
  • 7-9: Most assertions strongly match
  • 4-6: Some assertions partially match
  • 0-3: Few or no assertions match
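For illustration, the bands above can be encoded as a small helper (hypothetical, not part of the package) for use in test reporting:

```typescript
// Hypothetical helper: map a 0-10 confidence score to its
// interpretation band from the table above.
function scoreBand(score: number): string {
  if (score === 10) return 'all assertions perfectly match';
  if (score >= 7) return 'most assertions strongly match';
  if (score >= 4) return 'some assertions partially match';
  return 'few or no assertions match';
}

console.log(scoreBand(8)); // most assertions strongly match
```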

Amazon Bedrock Setup

Make sure you have Amazon Bedrock set up with:

  1. AWS credentials configured, or AWS profile stipulated, or environment variables set
  2. Bedrock service enabled in your region
  3. Model permissions configured
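For example, credentials can be supplied via environment variables before running the tests (the profile name is a placeholder):

```shell
# Use a named AWS profile (placeholder name):
export AWS_PROFILE=my-profile
export AWS_REGION=us-east-1

# Alternatively, set static credentials directly:
# export AWS_ACCESS_KEY_ID=...
# export AWS_SECRET_ACCESS_KEY=...
```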

Note: The default model supports cross-region inference, which means that if the model is unavailable or throttled in your region, Amazon Bedrock will route the request to the same model in another region: https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html

License

MIT License - see the LICENSE file for details

Contributing

Contributions are welcome! Please feel free to submit a pull request.