evals
v2.2.8
Arize evals package
evals
An npm package for Arize evals functionality.
About Arize AI
Arize AI is a leading company in AI observability and evaluation, dedicated to ensuring that artificial intelligence systems operate reliably in real-world applications. Founded in 2019, Arize provides tools that help machine learning teams monitor, troubleshoot, and improve model performance across various domains, including structured data, computer vision, and large language models (LLMs).
Arize Phoenix
Arize Phoenix is an open-source library designed for LLM tracing and evaluation. It enables developers to evaluate, experiment, and optimize AI products in real time. Key features include:
- Application Tracing: Collects LLM application data with automatic instrumentation, providing comprehensive visibility into model operations
- Interactive Prompt Playground: Offers a flexible environment for prompt and model iteration, allowing users to compare prompts, visualize outputs, and debug failures within their workflow
- Streamlined Evaluations and Annotations: Facilitates efficient assessment and documentation of model performance
Phoenix is built on OpenTelemetry, ensuring seamless setup, full transparency, and no vendor lock-in. It's perfect for teams who want to get started with LLM observability and evaluation in a fully local, open-source environment.
Arize AX
Arize AX is the enterprise AI engineering platform that extends the capabilities of Phoenix, offering a comprehensive suite for development, evaluation, and observability. Key features include:
- Prompts: Tools for prompt engineering, including a prompt playground, prompt hub for management and versioning, and a prompt builder powered by AI
- Experiments: Systematic A/B testing of prompts against large datasets to optimize model performance
- Tracing: Detailed tracing capabilities to monitor and debug AI applications effectively
- Evaluation: Comprehensive evaluation metrics and tools to assess model outputs and performance
- Alyx: An AI engineering agent embedded within the platform to assist with various tasks
Arize AX is designed to support teams and organizations with larger data needs, providing robust support, collaboration features, and multiple deployment options including SaaS, Virtual Private Cloud (VPC), and Arize Private Connect.
Installation
npm install evals
The post-install script will automatically launch the CLI interface.
Usage
Run the CLI manually:
npm start
Or use the binary:
npx evals