chat-about-video
v5.6.0
Chat about a video clip using ChatGPT hosted in OpenAI or Azure, or Gemini provided by Google
chat-about-video
Chat about zero, one, or more video clips using OpenAI ChatGPT (hosted in OpenAI or Microsoft Azure) or Google Gemini (hosted in Google Cloud).
chat-about-video is a powerful Unified Abstraction Layer designed to accelerate the development of conversational AI applications. It provides a standardized interface for interacting with OpenAI ChatGPT (OpenAI or Azure) and Google Gemini, allowing you to switch between providers with zero or minimal changes to your application logic.
Why use chat-about-video?
- Provider Agnostic: Write your code once and swap between ChatGPT and Gemini via configuration. This future-proofs your application against model changes or pricing shifts.
- Unified Video Handling: Seamlessly handles the complexities of frame extraction and cloud storage uploading (for ChatGPT) or direct ingestion (for Gemini) through a single API.
- Simplified Tool Calling: A standardized way to define and handle tool/function calls across different model providers.
- Production Ready: Built-in retries for throttling, server errors, and connectivity issues.
Key features
- Switch providers effortlessly: Change from ChatGPT to Gemini (or vice-versa) without rewriting your conversation logic.
- Multi-Cloud Support: Supports models hosted in Azure OpenAI, OpenAI, and Google Cloud.
- Flexible Media Input: Extract frames automatically via FFmpeg or supply your own images.
- Rich Conversations: Supports multiple videos and image groups in a single chat.
- Mandated Output: Force JSON responses with or without schemas.
- Resilient: Automatic backoff and retries for 429, 5xx, and network errors.
- Usage Tracking: Built-in token usage metadata collection.
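To make the retry behaviour listed above more concrete, here is a minimal sketch of how "retry on 429, 5xx, and network errors with backoff" is commonly decided. It is an illustration only and not the package's actual implementation; the function names `shouldRetry` and `backoffDelayMs`, and the default delays, are assumptions made for this example.

```typescript
// Illustrative only: not the package's actual retry implementation.
// `status` is the HTTP status code, or undefined when the request failed
// before any response arrived (for example, a connectivity problem).
function shouldRetry(status: number | undefined): boolean {
  if (status === undefined) {
    return true; // network error: no response was received
  }
  // 429 = throttled, 5xx = server-side error
  return status === 429 || (status >= 500 && status <= 599);
}

// Exponential backoff: baseMs, 2 * baseMs, 4 * baseMs, ... capped at maxMs.
// `attempt` is zero-based. The default values are arbitrary.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}
```

Client errors such as 400 or 404 are not retried in this sketch because repeating the same request would not change the outcome.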
Usage
Installation (quick start)
To use chat-about-video in your Node.js application,
add it as a dependency along with other necessary packages based on your usage scenario.
Below are examples for typical setups:
# ChatGPT on OpenAI or Azure with Azure Blob Storage
npm i chat-about-video openai @ffmpeg-installer/ffmpeg @azure/storage-blob
# Gemini in Google Cloud
npm i chat-about-video @google/generative-ai @ffmpeg-installer/ffmpeg
# ChatGPT on OpenAI or Azure with AWS S3
npm i chat-about-video openai @ffmpeg-installer/ffmpeg @handy-common-utils/aws-utils @aws-sdk/s3-request-presigner @aws-sdk/client-s3
If the ffmpeg binary is already available, you don't need to add the dependency @ffmpeg-installer/ffmpeg.
Optional dependencies
ChatGPT
To use ChatGPT hosted on OpenAI or Azure:
npm i openai
Gemini
To use Gemini hosted on Google Cloud:
npm i @google/generative-ai
ffmpeg
If you need ffmpeg for extracting video frame images, ensure it is installed. You can use a system package manager or an NPM package:
sudo apt install ffmpeg
# or
npm i @ffmpeg-installer/ffmpeg
Azure Blob Storage
To use Azure Blob Storage for frame images (not needed for Gemini):
npm i @azure/storage-blob
AWS S3
To use AWS S3 for frame images (not needed for Gemini):
npm i @handy-common-utils/aws-utils @aws-sdk/s3-request-presigner @aws-sdk/client-s3
How the video is provided to ChatGPT or Gemini
ChatGPT
chat-about-video supports uploading video frames into cloud storage and making them available to ChatGPT.
- Integrate ChatGPT from Microsoft Azure or OpenAI effortlessly.
- Utilize ffmpeg integration provided by this package for frame image extraction or opt for a DIY approach.
- Store frame images with ease, supporting Azure Blob Storage and AWS S3.
- Models hosted in Azure seem to allow fewer images per request than models hosted in OpenAI.
Gemini
chat-about-video supports sending video frames directly to Google's API without requiring cloud storage.
- Utilize ffmpeg integration provided by this package for frame image extraction or opt for a DIY approach.
- The number of frame images is only limited by the Gemini API in Google Cloud.
Concrete types and low level clients
ChatAboutVideo and Conversation are generic classes.
Use them without concrete generic type parameters when you want the flexibility to easily switch between ChatGPT and Gemini.
Otherwise, you may want to use a concrete type. Below are some examples:
// cast to a concrete type
const castToChatGpt = chat as ChatAboutVideoWithChatGpt;
// you can also just leave the ChatAboutVideo instance generic, but narrow down the conversation type
const conversationWithGemini = (await chat.startConversation(...)) as ConversationWithGemini;
const conversationWithChatGpt = await (chat as ChatAboutVideoWithChatGpt).startConversation(...);
To access the underlying API wrapper, use the getApi() function on the ChatAboutVideo instance.
To get the raw API client, use the getClient() function on the awaited object returned from getApi().
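When the generic classes are used, the only provider-specific part of an application is typically the options object passed to the ChatAboutVideo constructor. The sketch below shows one way to build that object from configuration. The helper `buildOptions`, the `Provider` and `ProviderOptions` types, and `requireEnv` are hypothetical and not part of the package; the property names and environment variable names are taken from the examples later in this README.

```typescript
// Hypothetical helper, not part of chat-about-video. Property names follow
// the examples in this README; everything else is an assumption.
type Provider = 'openai' | 'azure' | 'gemini';
type Env = Record<string, string | undefined>;

interface ProviderOptions {
  endpoint?: string;
  credential: { key: string };
  clientSettings?: Record<string, unknown>;
  completionOptions?: Record<string, unknown>;
}

// Fail early with a readable message when a required setting is missing.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing environment variable ${name}`);
  }
  return value;
}

function buildOptions(provider: Provider, env: Env): ProviderOptions {
  switch (provider) {
    case 'openai':
      return {
        credential: { key: requireEnv(env, 'OPENAI_API_KEY') },
        // model is required by OpenAI
        completionOptions: { model: env['OPENAI_MODEL_NAME'] || 'gpt-4o' },
      };
    case 'azure':
      return {
        endpoint: requireEnv(env, 'AZURE_OPENAI_API_ENDPOINT'),
        credential: { key: requireEnv(env, 'AZURE_OPENAI_API_KEY') },
        // deployment and apiVersion are required by Azure
        clientSettings: {
          deployment: requireEnv(env, 'AZURE_OPENAI_DEPLOYMENT_NAME'),
          apiVersion: '2024-10-21',
        },
      };
    case 'gemini':
      return {
        credential: { key: requireEnv(env, 'GEMINI_API_KEY') },
        clientSettings: { modelParams: { model: 'gemini-2.5-flash' } },
      };
  }
}
```

The result would be passed as the first constructor argument, for example `new ChatAboutVideo(buildOptions('gemini', process.env))`. For the two ChatGPT variants a `storage` section would normally be added as well, since frame images are uploaded to cloud storage for ChatGPT.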
Cleaning up
Intermediate files, such as extracted frame images, can be saved locally or in the cloud.
To remove these files when they are no longer needed, remember to call the end() function
on the Conversation instance when the conversation finishes.
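One way to make sure end() is always called, even when something throws midway, is to wrap the conversation in a try/finally. The sketch below is independent of the package: `ConversationLike` is a local stand-in that declares only the two methods used here, and `withConversation` is a hypothetical helper name.

```typescript
// Local stand-in for the parts of a conversation used in this sketch.
// The real Conversation class has more members.
interface ConversationLike {
  say(message: string): Promise<unknown>;
  end(): Promise<void>;
}

// Hypothetical helper: runs `work` and always calls end() afterwards,
// whether `work` resolves or throws.
async function withConversation<T>(
  conversation: ConversationLike,
  work: (conversation: ConversationLike) => Promise<T>,
): Promise<T> {
  try {
    return await work(conversation);
  } finally {
    await conversation.end();
  }
}
```

A conversation returned by startConversation(...) could be passed as the first argument, with the questions asked inside the callback.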
Mandating JSON response
A JSON response can be guaranteed either with or without a JSON Schema. The example code below works for both ChatGPT and Gemini:
// Without specifying a JSON schema
const explanation = await conversation.say(
  'Explain your answer. The response should be in JSON like this: {"referencedFrames": [1, 5], "why": "Reason for giving this response."}',
  { jsonResponse: true },
);
console.log(chalk.grey("\nAI's Explanation: " + JSON.stringify(JSON.parse(explanation!), null, 2)));
// With a JSON schema
const detailedExplanation = await conversation.say('Explain your answer in detail. The response should be in JSON.', {
  jsonResponse: {
    name: 'DetailedExplanation',
    schema: {
      type: 'object',
      properties: {
        referencedFrames: {
          type: 'array',
          items: { type: 'integer' },
        },
        understandingOfTheQuestion: { type: 'string' },
        reasoningSteps: { type: 'array', items: { type: 'string' } },
      },
      required: ['referencedFrames', 'understandingOfTheQuestion', 'reasoningSteps'],
    },
  },
});
console.log(chalk.grey("\nAI's detailed explanation: " + JSON.stringify(JSON.parse(detailedExplanation!), null, 2)));
Tool Calling (Function Calling)
chat-about-video supports tool calling for both ChatGPT and Gemini. This allows the AI to request information by calling functions you've defined.
1. Define Tools
Pass your tool definitions in the completion options. The structure follows the underlying API (OpenAI or Gemini):
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get the current weather',
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string' },
        },
        required: ['location'],
      },
    },
  },
];
const answer = await conversation.say("What's the weather like in Melbourne?", { tools });
2. Handle Tool Calls
The say and submitToolCallResults methods will return an object containing toolCalls if the AI wants to call tools. You are responsible for executing the tools and submitting the results back.
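One way to organise the "executing the tools" step is a lookup table from tool name to handler. The sketch below is independent of the package: `ToolCallLike` and `ToolResultLike` are local stand-ins that mirror the fields used in this README (`name`, `arguments`, `id`, `result`, `toolCallId`) and are assumptions, not the package's exported types. `runToolCalls` is a hypothetical helper name.

```typescript
// Local stand-ins mirroring the fields used in this README.
interface ToolCallLike {
  id?: string;
  name: string;
  arguments: Record<string, unknown>;
}

interface ToolResultLike {
  name: string;
  result: unknown;
  toolCallId?: string;
}

type ToolHandler = (args: Record<string, unknown>) => Promise<unknown> | unknown;

// Hypothetical helper: executes each requested tool through a handler table
// and collects the results in the order the calls were requested.
async function runToolCalls(
  calls: ToolCallLike[],
  handlers: { [name: string]: ToolHandler | undefined },
): Promise<ToolResultLike[]> {
  const results: ToolResultLike[] = [];
  for (const call of calls) {
    const handler = handlers[call.name];
    // Report unknown tools back to the model instead of throwing,
    // so that the conversation can continue.
    const result = handler ? await handler(call.arguments) : { error: `Unknown tool: ${call.name}` };
    results.push({ name: call.name, result, toolCallId: call.id });
  }
  return results;
}
```

Returning an error object for an unknown tool is a design choice made for this sketch; throwing would be equally reasonable if an unknown tool name indicates a bug. The package-level loop that drives say and submitToolCallResults follows.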
import { ConversationResponse, ToolCallResult } from 'chat-about-video';
let response = await conversation.say('What is the weather in Melbourne?', { tools });
// Loop to handle potential multiple rounds of tool calling
while (typeof response !== 'string' && response?.toolCalls) {
  const toolResults: ToolCallResult[] = [];
  for (const call of response.toolCalls) {
    console.log(`AI requests tool: ${call.name}(${JSON.stringify(call.arguments)})`);
    // Execute your tool logic
    const result = await myWeatherFunction(call.arguments.location);
    toolResults.push({
      name: call.name,
      result: { temperature: result.temp, unit: 'C' },
      toolCallId: call.id, // Required for OpenAI
    });
  }
  // Submit results back to the AI
  response = await conversation.submitToolCallResults(toolResults);
}
// Final text response
console.log('AI Answer:', response);
Customisation
Frame extraction
If you would like to customise how frame images are extracted and stored, consider these:
- In the options object passed to the constructor of ChatAboutVideo, there's a property extractVideoFrames. This property allows you to customise how frame images are extracted.
  - format, interval, limit, width, height - These allow you to specify your expectations for the extraction.
  - deleteFilesWhenConversationEnds - This flag specifies whether extracted frame images should be deleted from the local file system when the conversation ends.
  - framesDirectoryResolver - You can supply a function for determining where extracted frame image files should be stored locally.
  - extractor - You can supply a function for doing the extraction.
- In the options object passed to the constructor of ChatAboutVideo, there's a property storage. For ChatGPT, storing frame images in the cloud is recommended. You can use this property to customise how frame images are stored in the cloud.
  - azureStorageConnectionString - If you would like to use Azure Blob Storage, put the connection string in this property. If this property does not have a value, ChatAboutVideo assumes that you'd like to use AWS S3, and the default AWS identity/credential will be picked up from the OS.
  - storageContainerName, storagePathPrefix - These allow you to specify where the images should be stored.
  - downloadUrlExpirationSeconds - For images stored in the cloud, presigned download URLs with expiration are generated for ChatGPT to access. This property allows you to control the expiration time.
  - deleteFilesWhenConversationEnds - This flag specifies whether extracted frame images should be deleted from the cloud when the conversation ends.
  - uploader - You can supply a function for uploading images into the cloud.
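The properties listed above can be combined in a single options object. The fragment below is a sketch with illustrative values only: the property names are the ones listed above, while the variable name `customisedOptions` and every value (container name, path prefix, numbers, the connection string placeholder) are assumptions. The function-valued properties (framesDirectoryResolver, extractor, uploader) are left out because their signatures are not described in this section.

```typescript
// Illustrative values only; property names are the ones documented above.
const customisedOptions = {
  extractVideoFrames: {
    format: 'jpg',
    interval: 2, // same value as used in Example 1 of this README
    limit: 50,
    width: 640,
    // remove extracted frames from the local file system when the conversation ends
    deleteFilesWhenConversationEnds: true,
  },
  storage: {
    // When this is not set, AWS S3 is assumed
    azureStorageConnectionString: '<your-connection-string>',
    storageContainerName: 'video-frames',
    storagePathPrefix: 'demo/',
    downloadUrlExpirationSeconds: 3600,
    // remove uploaded frames from the cloud when the conversation ends
    deleteFilesWhenConversationEnds: true,
  },
};
```

These two sections would sit alongside credential and the model settings in the object passed to the ChatAboutVideo constructor.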
Settings of the underlying model
In the options object passed to the constructor of ChatAboutVideo, there's a property clientSettings,
and there's another property completionOptions. Settings of the underlying model can be configured
through those two properties.
You can also override settings using the last parameter of startConversation(...) function on ChatAboutVideo,
or the last parameter of say(...) function on Conversation.
Code examples
The following integration test files demonstrate various features and providers:
| File | AI Provider | Features |
| :------------------------------------------ | :--------------- | :----------------------------------- |
| chatgpt-openai-azure-storage.ts | ChatGPT (OpenAI) | Basic usage with Azure Storage |
| chatgpt-openai-azure-storage-multi-video.ts | ChatGPT (OpenAI) | Multiple videos in one conversation |
| chatgpt-azure-azure-storage-json.ts | ChatGPT (Azure) | JSON response mode |
| gemini-json.ts | Google Gemini | JSON response mode |
| chatgpt-manual-frames.ts | ChatGPT | Manual frame extraction using FFmpeg |
| chatgpt-azure-azure-storage-tools.ts | ChatGPT (Azure) | Tool/Function calling |
| gemini-tools.ts | Google Gemini | Tool/Function calling |
Example 1: Using ChatGPT hosted in OpenAI with Azure Blob Storage
Source: test/integration/chatgpt-openai-azure-storage.ts
// This is a demo utilising ChatGPT hosted in OpenAI.
// Video frame images are uploaded to Azure Blob Storage and then made available to GPT from there.
//
// This script can be executed with a command line like this from the project root directory:
// export OPENAI_API_KEY=...
// export AZURE_STORAGE_CONNECTION_STRING=...
// export OPENAI_MODEL_NAME=...
// export AZURE_STORAGE_CONTAINER_NAME=...
// ENABLE_DEBUG=true DEMO_VIDEO=~/Downloads/test1.mp4 npx ts-node test/integration/chatgpt-openai-azure-storage.ts
//
import { consoleWithColour } from '@handy-common-utils/misc-utils';
import chalk from 'chalk';
import readline from 'node:readline';
import { ChatAboutVideo, ConversationWithChatGpt } from '../src';
async function demo() {
  const chat = new ChatAboutVideo(
    {
      credential: {
        key: process.env.OPENAI_API_KEY!,
      },
      storage: {
        azureStorageConnectionString: process.env.AZURE_STORAGE_CONNECTION_STRING!,
        storageContainerName: process.env.AZURE_STORAGE_CONTAINER_NAME || 'vision-experiment-input',
        storagePathPrefix: 'video-frames/',
      },
      completionOptions: {
        // model is required by OpenAI
        model: process.env.OPENAI_MODEL_NAME || 'gpt-4o', // 'gpt-4-vision-preview', // or gpt-4o
      },
      extractVideoFrames: {
        limit: 100,
        interval: 2,
      },
    },
    consoleWithColour({ debug: process.env.ENABLE_DEBUG === 'true' }, chalk),
  );
  const conversation = (await chat.startConversation(process.env.DEMO_VIDEO!)) as ConversationWithChatGpt;
  const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
  const prompt = (question: string) => new Promise<string>((resolve) => rl.question(question, resolve));
  while (true) {
    const question = await prompt(chalk.red('\nUser: '));
    if (!question) {
      continue;
    }
    if (['exit', 'quit', 'q', 'end'].includes(question)) {
      await conversation.end();
      break;
    }
    const answer = await conversation.say(question, { max_tokens: 2000 });
    console.log(chalk.blue('\nAI:' + answer));
  }
  console.log('Demo finished');
  rl.close();
}
demo().catch((error) => console.log(chalk.red(JSON.stringify(error, null, 2))));
Example 2: Multiple videos using ChatGPT hosted in OpenAI with Azure Blob Storage
Source: test/integration/chatgpt-openai-azure-storage-multi-video.ts
async function demo() {
  ...
  const conversation = (await chat.startConversation([
    { videoFile: process.env.DEMO_VIDEO_1!, promptText: 'This is the first video:' },
    { videoFile: process.env.DEMO_VIDEO_2!, promptText: 'This is the second video:' },
    { videoFile: process.env.DEMO_VIDEO_1!, promptText: 'This is the third video:' },
  ])) as ConversationWithChatGpt;
  ...
}
Example 3: Using ChatGPT hosted in Azure with Azure Blob Storage
Source: test/integration/chatgpt-azure-azure-storage-json.ts
// This is a demo utilising ChatGPT hosted in Azure.
// Video frame images are uploaded to Azure Blob Storage and then made available to GPT from there.
//
// This script can be executed with a command line like this from the project root directory:
// export AZURE_OPENAI_API_ENDPOINT=..
// export AZURE_OPENAI_API_KEY=...
// export AZURE_OPENAI_DEPLOYMENT_NAME=...
// export AZURE_STORAGE_CONNECTION_STRING=...
// export AZURE_STORAGE_CONTAINER_NAME=...
// ENABLE_DEBUG=true DEMO_VIDEO=~/Downloads/test1.mp4 npx ts-node test/integration/chatgpt-azure-azure-storage-json.ts
import { consoleWithColour } from '@handy-common-utils/misc-utils';
import chalk from 'chalk';
import readline from 'node:readline';
import { ChatAboutVideo, ConversationWithChatGpt } from '../src';
async function demo() {
  const chat = new ChatAboutVideo(
    {
      endpoint: process.env.AZURE_OPENAI_API_ENDPOINT!,
      credential: {
        key: process.env.AZURE_OPENAI_API_KEY!,
      },
      storage: {
        azureStorageConnectionString: process.env.AZURE_STORAGE_CONNECTION_STRING!,
        storageContainerName: process.env.AZURE_STORAGE_CONTAINER_NAME || 'vision-experiment-input',
        storagePathPrefix: 'video-frames/',
      },
      clientSettings: {
        // deployment is required by Azure
        deployment: process.env.AZURE_OPENAI_DEPLOYMENT_NAME || 'gpt4vision',
        // apiVersion is required by Azure
        apiVersion: '2024-10-21',
      },
    },
    consoleWithColour({ debug: process.env.ENABLE_DEBUG === 'true' }, chalk),
  );
  const conversation = (await chat.startConversation(process.env.DEMO_VIDEO!)) as ConversationWithChatGpt;
  const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
  const prompt = (question: string) => new Promise<string>((resolve) => rl.question(question, resolve));
  while (true) {
    const question = await prompt(chalk.red('\nUser: '));
    if (!question) {
      continue;
    }
    if (['exit', 'quit', 'q', 'end'].includes(question)) {
      await conversation.end();
      break;
    }
    const answer = await conversation.say(question, { max_tokens: 2000 });
    console.log(chalk.blue('\nAI:' + answer));
  }
  console.log('Demo finished');
  rl.close();
}
demo().catch((error) => console.log(chalk.red(JSON.stringify(error, null, 2))));
Example 4: Using Gemini hosted in Google Cloud
Source: test/integration/gemini-json.ts
// This is a demo utilising Google Gemini through Google Generative Language API.
// Google Gemini allows many frame images to be supplied because of its huge context length.
// Video frame images are sent through Google Generative Language API directly.
//
// This script can be executed with a command line like this from the project root directory:
// export GEMINI_API_KEY=...
// ENABLE_DEBUG=true DEMO_VIDEO=~/Downloads/test1.mp4 npx ts-node test/integration/gemini-json.ts
import { consoleWithColour } from '@handy-common-utils/misc-utils';
import chalk from 'chalk';
import readline from 'node:readline';
import { HarmBlockThreshold, HarmCategory } from '@google/generative-ai';
import { ChatAboutVideo, ConversationWithGemini } from '../src';
async function demo() {
  const chat = new ChatAboutVideo(
    {
      credential: {
        key: process.env.GEMINI_API_KEY!,
      },
      clientSettings: {
        modelParams: {
          model: 'gemini-2.5-flash',
        },
      },
      extractVideoFrames: {
        limit: 100,
        interval: 0.5,
      },
      completionOptions: {
        safetySettings: [
          {
            category: 'HARM_CATEGORY_HATE_SPEECH' as any,
            threshold: 'BLOCK_NONE' as any,
          },
        ],
      },
    },
    consoleWithColour({ debug: process.env.ENABLE_DEBUG === 'true' }, chalk),
  );
  const conversation = (await chat.startConversation(process.env.DEMO_VIDEO!)) as ConversationWithGemini;
  const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
  const prompt = (question: string) => new Promise<string>((resolve) => rl.question(question, resolve));
  while (true) {
    const question = await prompt(chalk.red('\nUser: '));
    if (!question) {
      continue;
    }
    if (['exit', 'quit', 'q', 'end'].includes(question)) {
      await conversation.end();
      break;
    }
    const answer = await conversation.say(question, {
      safetySettings: [{ category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT, threshold: HarmBlockThreshold.BLOCK_NONE }],
    });
    console.log(chalk.blue('\nAI:' + answer));
  }
  console.log('Demo finished');
  rl.close();
}
demo().catch((error) => console.log(chalk.red(JSON.stringify(error, null, 2)), error));
Example 5: Multiple groups of extracted frame images using ChatGPT hosted in Azure with Azure Blob Storage
Source: test/integration/chatgpt-manual-frames.ts
async function demo() {
  const tmpDir = os.tmpdir();
  const video1 = process.env.DEMO_VIDEO_1!;
  const video2 = process.env.DEMO_VIDEO_2!;
  const outputDir1 = path.join(tmpDir, 'video1-frames');
  const outputDir2 = path.join(tmpDir, 'video2-frames');
  console.log(chalk.green('Extracting frames from the first video...'));
  const { relativePaths: frames1, cleanup: cleanupFrames1 } = await extractVideoFramesWithFfmpeg(video1, outputDir1, 1, 'jpg', 200);
  console.log(chalk.green('Extracting frames from the second video...'));
  const { relativePaths: frames2, cleanup: cleanupFrames2 } = await extractVideoFramesWithFfmpeg(video2, outputDir2, 3, 'jpg', 200);
  const chat = new ChatAboutVideo(
    {
      credential: {
        key: process.env.OPENAI_API_KEY!,
      },
      storage: {
        azureStorageConnectionString: process.env.AZURE_STORAGE_CONNECTION_STRING!,
        storageContainerName: process.env.AZURE_STORAGE_CONTAINER_NAME || 'vision-experiment-input',
        storagePathPrefix: 'video-frames/',
      },
      completionOptions: {
        model: process.env.OPENAI_MODEL_NAME || 'gpt-4o',
      },
    },
    consoleWithColour({ debug: process.env.ENABLE_DEBUG === 'true' }, chalk),
  );
  const conversation = (await chat.startConversation([
    {
      promptText: 'Frame images from sample 1:',
      images: frames1.map((frame, i) => ({ imageFile: path.join(outputDir1, frame), promptText: `Frame CodeRed-${i + 1}` })),
    },
    {
      promptText: 'Frame images from sample 2, also known as the "good example":',
      images: frames2.map((frame) => ({ imageFile: path.join(outputDir2, frame) })),
    },
  ])) as ConversationWithChatGpt;
  ...
}
API
chat-about-video
Modules
Classes
Class: ChatAboutVideo<CLIENT, OPTIONS, PROMPT, RESPONSE>
chat.ChatAboutVideo
Type parameters
| Name | Type |
| :--------- | :--------------------------------------------------------------------------------------------- |
| CLIENT | any |
| OPTIONS | extends AdditionalCompletionOptions = any |
| PROMPT | any |
| RESPONSE | any |
Constructors
constructor
• new ChatAboutVideo<CLIENT, OPTIONS, PROMPT, RESPONSE>(options, log?)
Type parameters
| Name | Type |
| :--------- | :--------------------------------------------------------------------------------------------- |
| CLIENT | any |
| OPTIONS | extends AdditionalCompletionOptions = any |
| PROMPT | any |
| RESPONSE | any |
Parameters
| Name | Type |
| :-------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| options | SupportedChatApiOptions |
| log | undefined | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void> |
Properties
| Property | Description |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----------- |
| Protected apiPromise: Promise<ChatApi<CLIENT, OPTIONS, PROMPT, RESPONSE>> | |
| Protected log: undefined | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void> | |
| Protected options: SupportedChatApiOptions | |
Methods
getApi
▸ getApi(): Promise<ChatApi<CLIENT, OPTIONS, PROMPT, RESPONSE>>
Get the underlying API instance.
Returns
Promise<ChatApi<CLIENT, OPTIONS, PROMPT, RESPONSE>>
The underlying API instance.
startConversation
▸ startConversation(log?): Promise<Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>>
Start a conversation without a video
Parameters
| Name | Type | Description |
| :----- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------------- |
| log? | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void> | Optional logger for this conversation, if not provided, the logger of ChatAboutVideo instance will be used. |
Returns
Promise<Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>>
The conversation.
▸ startConversation(options?, log?): Promise<Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>>
Start a conversation without a video
Parameters
| Name | Type | Description |
| :--------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------------- |
| options? | OPTIONS | Overriding options for this conversation |
| log? | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void> | Optional logger for this conversation, if not provided, the logger of ChatAboutVideo instance will be used. |
Returns
Promise<Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>>
The conversation.
▸ startConversation(videoFile, log?): Promise<Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>>
Start a conversation about a video.
Parameters
| Name | Type | Description |
| :---------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------------- |
| videoFile | string | Path to a video file in local file system. |
| log? | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void> | Optional logger for this conversation, if not provided, the logger of ChatAboutVideo instance will be used. |
Returns
Promise<Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>>
The conversation.
▸ startConversation(videoFile, options?, log?): Promise<Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>>
Start a conversation about a video.
Parameters
| Name | Type | Description |
| :---------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------------- |
| videoFile | string | Path to a video file in local file system. |
| options? | OPTIONS | Overriding options for this conversation |
| log? | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void> | Optional logger for this conversation, if not provided, the logger of ChatAboutVideo instance will be used. |
Returns
Promise<Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>>
The conversation.
▸ startConversation(videos, log?): Promise<Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>>
Start a conversation about one or more videos or groups of images.
Parameters
| Name | Type | Description |
| :------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| videos | (VideoInput | ImagesInput)[] | Array of videos or images to be used in the conversation. For each video, the video file path and the prompt before the video should be provided. For each group of images, the image file paths and the prompt before the image group should be provided. |
| log? | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void> | Optional logger for this conversation, if not provided, the logger of ChatAboutVideo instance will be used. |
Returns
Promise<Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>>
The conversation.
▸ startConversation(videos, options?, log?): Promise<Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>>
Start a conversation about one or more videos or groups of images.
Parameters
| Name | Type | Description |
| :--------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| videos | (VideoInput | ImagesInput)[] | Array of videos or images to be used in the conversation. For each video, the video file path and the prompt before the video should be provided. For each group of images, the image file paths and the prompt before the image group should be provided. |
| options? | OPTIONS | Overriding options for this conversation |
| log? | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void> | Optional logger for this conversation, if not provided, the logger of ChatAboutVideo instance will be used. |
Returns
Promise<Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>>
The conversation.
Class: Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>
chat.Conversation
Type parameters
| Name | Type |
| :--------- | :--------------------------------------------------------------------------------------------- |
| CLIENT | any |
| OPTIONS | extends AdditionalCompletionOptions = any |
| PROMPT | any |
| RESPONSE | any |
Constructors
constructor
• new Conversation<CLIENT, OPTIONS, PROMPT, RESPONSE>(conversationId, api, prompt, options, cleanup?, log?)
Type parameters
| Name | Type |
| :--------- | :--------------------------------------------------------------------------------------------- |
| CLIENT | any |
| OPTIONS | extends AdditionalCompletionOptions = any |
| PROMPT | any |
| RESPONSE | any |
Parameters
| Name | Type |
| :--------------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| conversationId | string |
| api | ChatApi<CLIENT, OPTIONS, PROMPT, RESPONSE> |
| prompt | undefined | PROMPT |
| options | OPTIONS |
| cleanup? | () => Promise<any> |
| log | undefined | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void> |
Properties
| Property | Description |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----------- |
| Protected api: ChatApi<CLIENT, OPTIONS, PROMPT, RESPONSE> | |
| Protected Optional cleanup: () => Promise<any> | |
| Protected conversationId: string | |
| Protected log: undefined | LineLogger<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void> | |
| Protected options: OPTIONS | |
| Protected prompt: undefined | PROMPT | |
| Protected usage: undefined | UsageMetadata | |
Methods
end
▸ end(): Promise<void>
Returns
Promise<void>
getApi
▸ getApi(): ChatApi<CLIENT, OPTIONS, PROMPT, RESPONSE>
Get the underlying API instance.
Returns
ChatApi<CLIENT, OPTIONS, PROMPT, RESPONSE>
The underlying API instance.
getPrompt
▸ getPrompt(): undefined | PROMPT
Get the prompt for the current conversation. The prompt is the accumulated messages in the conversation so far.
Returns
undefined | PROMPT
The prompt which is the accumulated messages in the conversation so far.
getUsage
▸ getUsage(): undefined | UsageMetadata
Get usage statistics of the conversation.
Note that the usage statistics are undefined before the first say call.
They can also be undefined if the underlying API does not support usage statistics.
The usage statistics may not cover requests that failed due to content filtering or other reasons.
The reported usage can therefore be lower than the billable usage.
Returns
undefined | UsageMetadata
The usage statistics of the conversation. Or undefined if not available.
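Because getUsage() can return undefined, callers may want to guard the read. The following is a minimal sketch under stated assumptions: UsageLike, UsageSource and usageOrFallback are hypothetical names introduced for illustration, and the real field names of UsageMetadata are not shown in this section.

```typescript
// Hypothetical stand-in for UsageMetadata; the real field names are not
// documented in this section, so a plain numeric record is assumed here.
type UsageLike = Record<string, number>;

// Minimal view of a conversation: only the documented getUsage() method.
interface UsageSource<U> {
  getUsage(): U | undefined;
}

// Returns the usage statistics, or the supplied fallback when none are
// available (before the first say call, or when the API reports no usage).
function usageOrFallback<U>(conversation: UsageSource<U>, fallback: U): U {
  const usage = conversation.getUsage();
  return usage === undefined ? fallback : usage;
}
```

Passing an explicit fallback keeps the "not available" case visible at the call site instead of hiding it behind a default of zero.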
progressConversation
▸ Protected progressConversation(updatedPrompt, effectiveOptions): Promise<undefined | string | ConversationResponse>
Parameters
| Name | Type |
| :----------------- | :-------- |
| updatedPrompt | PROMPT |
| effectiveOptions | OPTIONS |
Returns
Promise<undefined | string | ConversationResponse>
say
▸ say(message, options?): Promise<undefined | string | ConversationResponse>
Say something in the conversation and get the response from the AI.
Parameters
| Name | Type | Description |
| :--------- | :--------------------- | :-------------------------------------- |
| message | string | The message to say in the conversation. |
| options? | Partial<OPTIONS> | Options for fine control. |
Returns
Promise<undefined | string | ConversationResponse>
The response/completion or tool calls.
submitToolCallResults
▸ submitToolCallResults(toolResults, options?): Promise<undefined | string | ConversationResponse>
Submit tool call results to the conversation, and get the response from AI.
Parameters
| Name | Type | Description |
| :------------ | :----------------------------------------------------- | :-------------------------- |
| toolResults | ToolCallResult[] | Array of tool call results. |
| options? | Partial<OPTIONS> | Options for fine control. |
Returns
Promise<undefined | string | ConversationResponse>
The response/completion or tool calls.
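Since say and submitToolCallResults both resolve to either text or tool calls, a caller typically loops until plain text comes back. The sketch below illustrates that loop against local stand-in types. ToolCallLike, ToolCallResultLike, ResponseLike and sayWithTools are hypothetical, and the library's actual ToolCall, ToolCallResult and ConversationResponse shapes may differ.

```typescript
// Hypothetical shapes for illustration only; the library's own types may differ.
interface ToolCallLike { name: string; arguments: unknown }
interface ToolCallResultLike { name: string; result: unknown }
interface ResponseLike { text?: string; toolCalls?: ToolCallLike[] }
type Reply = undefined | string | ResponseLike;

// Only the two documented methods are needed for the loop.
interface ConversationLike {
  say(message: string): Promise<Reply>;
  submitToolCallResults(results: ToolCallResultLike[]): Promise<Reply>;
}

type ToolHandler = (args: unknown) => Promise<unknown> | unknown;

// Sends a message, runs any requested tools, and submits their results
// until the model answers with text or maxRounds is exceeded.
async function sayWithTools(
  conversation: ConversationLike,
  message: string,
  handlers: Record<string, ToolHandler>,
  maxRounds = 5,
): Promise<string | undefined> {
  let reply = await conversation.say(message);
  for (let round = 0; ; round++) {
    if (reply === undefined || typeof reply === "string") return reply;
    const calls = reply.toolCalls ?? [];
    if (calls.length === 0) return reply.text;
    if (round >= maxRounds) throw new Error("Too many tool-call rounds");
    const results: ToolCallResultLike[] = [];
    for (const call of calls) {
      const handler = handlers[call.name];
      if (!handler) throw new Error(`No handler for tool: ${call.name}`);
      results.push({ name: call.name, result: await handler(call.arguments) });
    }
    reply = await conversation.submitToolCallResults(results);
  }
}
```

The round limit is a defensive choice in this sketch, so that a model which keeps requesting tools cannot loop forever.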
Class: ChatGptApi
chat-gpt.ChatGptApi
Implements
Constructors
constructor
• new ChatGptApi(options)
Parameters
| Name | Type |
| :-------- | :---------------------------------- |
| options | ChatGptOptions |
Properties
| Property | Description |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------- |
| Protected client: ChatGptClient | |
| Protected Optional extractVideoFrames: EffectiveExtractVideoFramesOptions | |
| Protected options: ChatGptOptions | |
| Protected Optional storage: Required<Pick<StorageOptions, "uploader">> & StorageOptions | |
| Protected tmpDir: string | |
Methods
appendToPrompt
▸ appendToPrompt(newPromptOrResponse, prompt?): Promise<ChatCompletionMessageParam[]>
Append a new prompt or response to form a full prompt. This function is useful for building a prompt that contains the conversation history.
Parameters
| Name | Type | Description |
| :-------------------- | :------------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| newPromptOrResponse | ChatCompletionMessageParam[] \| ChatCompletion | A new prompt to be appended, or previous response to be appended. |
| prompt? | ChatCompletionMessageParam[] | The conversation history which is a prompt containing previous prompts and responses. If it is not provided, the conversation history returned will contain only what is in newPromptOrResponse. |
Returns
Promise<ChatCompletionMessageParam[]>
The full prompt which is effectively the conversation history.
Implementation of
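The history-building behaviour described for appendToPrompt can be illustrated with a small stand-alone sketch. It is not the library's implementation: Message, Completion and appendToHistory are simplified, hypothetical stand-ins for the OpenAI SDK types, and taking only the first choice of a response is an assumption.

```typescript
// Simplified, hypothetical message and response shapes (not the OpenAI SDK types).
interface Message { role: "system" | "user" | "assistant"; content: string }
interface Completion { choices: { message: Message }[] }

// Appends either new prompt messages or the message of a previous response
// to an optional history. Without a history, the result contains only what
// was passed in as newPromptOrResponse.
function appendToHistory(
  newPromptOrResponse: Message[] | Completion,
  history: Message[] = [],
): Message[] {
  const additions = Array.isArray(newPromptOrResponse)
    ? newPromptOrResponse
    // Assumption: only the first choice of the response is kept.
    : newPromptOrResponse.choices.slice(0, 1).map((choice) => choice.message);
  // Return a new array so the caller's history is not mutated.
  return [...history, ...additions];
}
```

Alternating calls with a new prompt and then the matching response produce the accumulated conversation history, in order.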
buildImagesPrompt
▸ buildImagesPrompt(imageInputs, conversationId?): Promise<BuildPromptOutput<ChatCompletionMessageParam[], ChatGptCompletionOptions>>
Build a prompt for sending image content to the AI. Including images in the conversation sometimes requires additional options and/or cleanup. In such cases, this function can return options to be passed to the generateContent function and/or a cleanup callback function.
Parameters
| Name | Type | Description |
| :--------------- | :--------------------------------------------- | :------------------------------------- |
| imageInputs | ImageInput[] | Array of image inputs. |
| conversationId | string | Unique identifier of the conversation. |
Returns
Promise<BuildPromptOutput<ChatCompletionMessageParam[], ChatGptCompletionOptions>>
An object containing the prompt, optional options, and an optional cleanup function.
Implementation of
buildTextPrompt
▸ buildTextPrompt(text, _conversationId?): Promise<{ prompt: ChatCompletionMessageParam[] }>
Build prompt for sending text content to AI
Parameters
| Name | Type | Description |
| :----------------- | :------- | :------------------------------------- |
| text | string | The text content to be sent. |
| _conversationId? | string | Unique identifier of the conversation. |
Returns
Promise<{ prompt: ChatCompletionMessageParam[] }>
An object containing the prompt.
Implementation of
buildToolCallResultsPrompt
▸ buildToolCallResultsPrompt(toolResults, _conversationId?): Promise<BuildPromptOutput<ChatCompletionMessageParam[], ChatGptCompletionOptions>>
Build prompt for tool results.
Parameters
| Name | Type | Description |
| :----------------- | :----------------------------------------------------- | :------------------------------------- |
| toolResults | ToolCallResult[] | Array of tool call results. |
| _conversationId? | string | Unique identifier of the conversation. |
Returns
Promise<BuildPromptOutput<ChatCompletionMessageParam[], ChatGptCompletionOptions>>
An object containing the prompt.
Implementation of
ChatApi.buildToolCallResultsPrompt
buildVideoPrompt
▸ buildVideoPrompt(videoFile, conversationId?): Promise<BuildPromptOutput<ChatCompletionMessageParam[], ChatGptCompletionOptions>>
Build a prompt for sending video content to the AI. Including video in the conversation sometimes requires additional options and/or cleanup. In such cases, this function can return options to be passed to the generateContent function and/or a cleanup callback function.
Parameters
| Name | Type | Description |
| :--------------- | :------- | :------------------------------------- |
| videoFile | string | Path to the video file. |
| conversationId | string | Unique identifier of the conversation. |
Returns
Promise<BuildPromptOutput<ChatCompletionMessageParam[], ChatGptCompletionOptions>>
An object containing the prompt, optional options, and an optional cleanup function.
Implementation of
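The returned object (prompt, optional options, optional cleanup) could be consumed along the following lines for a one-shot request. This is a sketch under stated assumptions: BuildOutputLike and generateWithCleanup are hypothetical names, the field names follow the description above, and when the library itself invokes the cleanup callback is not stated here. In a multi-turn conversation the cleanup would presumably be deferred until the conversation ends.

```typescript
// Hypothetical local shape mirroring the description:
// a prompt, optional extra options, and an optional cleanup callback.
interface BuildOutputLike<P, O> {
  prompt: P;
  options?: Partial<O>;
  cleanup?: () => Promise<unknown>;
}

// Merges the options returned by the prompt builder over the base options,
// generates content once, and always runs the cleanup callback afterwards.
async function generateWithCleanup<P, O extends object, R>(
  built: BuildOutputLike<P, O>,
  baseOptions: O,
  generate: (prompt: P, options: O) => Promise<R>,
): Promise<R> {
  try {
    return await generate(built.prompt, { ...baseOptions, ...built.options });
  } finally {
    // Runs on success and on failure, e.g. to delete temporary frame images.
    await built.cleanup?.();
  }
}
```

Running the cleanup in a finally block means temporary resources are released even when generation throws.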
generateContent
▸ generateContent(prompt, options): Promise<ChatCompletion>
Generate content based on the given prompt and options.
Parameters
| Name | Type | Description |
| :-------- | :------------------------------------------------------ | :-------------------------------------------------- |
| prompt | ChatCompletionMessageParam[] | The full prompt to generate content. |
| options | ChatGptCompletionOptions | Optional options to control the content generation. |
Returns
Promise<ChatCompletion>
The generated content.
Implementation of
getClient
▸ getClient(): Promise<ChatGptClient>
Get the raw client. This function could be useful for advanced use cases.
Returns
Promise<ChatGptClient>
The raw client.
Implementation of
getResponseText
▸ getResponseText(result): Promise<undefined | string>
Get the text from the response object
Parameters
| Name | Type | Description |
| :------- | :--------------- | :------------------ |
| result | ChatCompletion | the response object |
Returns
Promise<undefined | string>
Implementation of
getToolCalls
▸ getToolCalls(result): Promise<undefined | ToolCall[]>
Extract tool calls from the response object.
Parameters
| Name | Type | Description |
| :------- | :--------------- | :------------------ |
| result | ChatCompletion | the response object |
Returns
Promise<undefined | ToolCall[]>
Array of tool calls if tool calling is requested by AI, or undefined otherwise.
Implementation of
getUsageMetadata
▸ getUsageMetadata(result): Promise<undefined | UsageMetadata>
Extract usage metadata from the response object.
Parameters
| Name | Type | Description |
| :------- | :--------------- | :------------------ |
| result | ChatCompletion | the response object |
Returns
Promise<undefined | UsageMetadata>
Usage metadata from the response, if available. If the response does not contain usage metadata, it returns undefined.
Implementation of
isConnectivityError
▸ isConnectivityError(error): boolean
Check if the error is a connectivity error.
Parameters
| Name | Type | Description |
| :------ | :---- | :--------------- |
| error | any | any error object |
Returns
boolean
true if the error is a connectivity error, false otherwise.
Implementation of
isDownloadError
▸ isDownloadError(error): boolean
Check if the error is a temporary download error.
Parameters
| Name | Type | Description |
| :------ | :---- | :--------------- |
| error | any | any error object |
Returns
boolean
true if the error is a temporary download error, false otherwise.
Implementation of
isServerError
▸ isServerError(error): boolean
Check if the error is a server error.
Parameters
| Name | Type | Description |
| :------ | :---- | :--------------- |
| error | any | any error object |
Returns
boolean
true if the error is a server error, false otherwise.
Implementation of
isThrottlingError
▸ isThrottlingError(error): boolean
Check if the error is a throttling error.
Parameters
| Name | Type | Description |
| :------ | :---- | :--------------- |
| error | any | any error object |
Returns
boolean
true if the error is a throttling error, false otherwise.
Implementation of
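The four error classifiers above (isConnectivityError, isDownloadError, isServerError, isThrottlingError) lend themselves to a retry wrapper. The library already retries internally, so the following is only an illustrative sketch of how such classifiers can drive exponential backoff in custom code. ErrorClassifier and withRetries are hypothetical names, and the attempt count and delays are arbitrary assumptions rather than the library's settings.

```typescript
// The four documented classifier methods, as a minimal interface.
interface ErrorClassifier {
  isThrottlingError(error: any): boolean;
  isServerError(error: any): boolean;
  isConnectivityError(error: any): boolean;
  isDownloadError(error: any): boolean;
}

// Runs the task, retrying with exponential backoff while the classifier
// considers the error transient and attempts remain.
async function withRetries<T>(
  task: () => Promise<T>,
  classifier: ErrorClassifier,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      const retryable =
        classifier.isThrottlingError(error) ||
        classifier.isServerError(error) ||
        classifier.isConnectivityError(error) ||
        classifier.isDownloadError(error);
      // Non-transient errors and exhausted attempts are rethrown unchanged.
      if (!retryable || attempt >= maxAttempts) throw error;
      // Delay doubles on each attempt: baseDelayMs, then 2x, then 4x, ...
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

An instance of ChatGptApi exposes all four methods, so it could be passed as the classifier argument in this sketch.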
Class: GeminiApi
gemini.GeminiApi
Implements
Constructors
constructor
• new GeminiApi(options)
Parameters
| Name | Type |
| :-------- | :-------------------------------- |
| options | GeminiOptions |
Properties
| Property | Description |
| --------------------------------------------------------------------------------------------------------------- | ----------- |
| Protected client: GenerativeModel | |
| Protected extractVideoFrames: EffectiveExtractVideoFramesOptions | |
| Protected options: GeminiOptions | |
| Protected tmpDir: string
