
AssemblyAI JavaScript SDK

The AssemblyAI JavaScript SDK provides an easy-to-use interface for interacting with the AssemblyAI API, which supports async and real-time transcription, as well as the latest LeMUR models. It is written in TypeScript primarily for Node.js, with all types exported, but it is also compatible with other runtimes.

Documentation

Visit the AssemblyAI documentation for step-by-step instructions and a lot more details about our AI models and API. Explore the SDK API reference for more details on the SDK types, functions, and classes.

Quickstart

Install the AssemblyAI SDK using your preferred package manager:

npm install assemblyai
yarn add assemblyai
pnpm add assemblyai
bun add assemblyai

Then, import the assemblyai module and create an AssemblyAI object with your API key:

import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY,
});

You can now use the client object to interact with the AssemblyAI API.
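For orientation, the client exposes the services used throughout this readme (a quick sketch, not an exhaustive list):

// Each service maps to a part of the AssemblyAI API
const { transcripts, realtime, lemur } = client;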

Using a CDN

You can use automatic CDNs like UNPKG to load the library from a script tag.

<script src="https://www.unpkg.com/assemblyai@:version/dist/assemblyai.umd.min.js"></script>

  • Replace :version with the desired version or latest.
  • Remove .min to load the non-minified version.

The script creates a global assemblyai variable containing all the services. Here's how you create a RealtimeTranscriber object.

const { RealtimeTranscriber } = assemblyai;
const transcriber = new RealtimeTranscriber({
  token: "[GENERATE TEMPORARY AUTH TOKEN IN YOUR API]",
  ...
});

For type support in your IDE, see Reference types from JavaScript.
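As a minimal sketch of that approach, a JSDoc type import in a plain .js file gives the editor completions without a TypeScript build (the variable name is just an illustration):

// Reference the SDK's exported types from plain JavaScript via JSDoc
/** @type {import("assemblyai").RealtimeTranscriber} */
let transcriber;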

Speech-To-Text

Transcribe audio and video files

When you create a transcript, you can either pass in a URL to an audio file or upload a file directly.

// Transcribe file at remote URL
let transcript = await client.transcripts.transcribe({
  audio: "https://storage.googleapis.com/aai-web-samples/espn-bears.m4a",
});

Note: You can also pass a local file path, a stream, or a buffer as the audio property.

transcribe queues a transcription job and polls it until the status is completed or error.
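Because the job can also finish with the error status, it's worth checking the result before using it. A minimal sketch:

let transcript = await client.transcripts.transcribe({
  audio: "https://storage.googleapis.com/aai-web-samples/espn-bears.m4a",
});

if (transcript.status === "error") {
  // The transcript's error field describes why the job failed
  console.error(transcript.error);
} else {
  console.log(transcript.text);
}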

If you don't want to wait until the transcript is ready, you can use submit:

let transcript = await client.transcripts.submit({
  audio: "https://storage.googleapis.com/aai-web-samples/espn-bears.m4a",
});

Uploading a local file works the same way:

// Upload a file via local path and transcribe
let transcript = await client.transcripts.transcribe({
  audio: "./news.mp4",
});

Note: You can also pass a file URL, a stream, or a buffer as the audio property.

Again, if you don't want to wait until the transcript is ready, you can use submit:

let transcript = await client.transcripts.submit({
  audio: "./news.mp4",
});
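As a sketch of the stream and buffer options mentioned in the note above (assuming Node.js and its built-in fs module):

import fs from "fs";

// Pass the file contents as a Buffer...
let transcript = await client.transcripts.transcribe({
  audio: fs.readFileSync("./news.mp4"),
});

// ...or as a readable stream
transcript = await client.transcripts.transcribe({
  audio: fs.createReadStream("./news.mp4"),
});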

You can extract even more insights from the audio by enabling any of our AI models using transcription options. For example, here's how to enable the Speaker Diarization model to detect who said what.

let transcript = await client.transcripts.transcribe({
  audio: "https://storage.googleapis.com/aai-web-samples/espn-bears.m4a",
  speaker_labels: true,
});
for (let utterance of transcript.utterances) {
  console.log(`Speaker ${utterance.speaker}: ${utterance.text}`);
}

Get a transcript

This will return the transcript object in its current state. If the transcript is still processing, the status field will be queued or processing. Once the transcript is complete, the status field will be completed.

transcript = await client.transcripts.get(transcript.id);

If you created a transcript using .submit(), you can still poll until the transcript status is completed or error using .waitUntilReady():

transcript = await client.transcripts.waitUntilReady(transcript.id, {
  // How frequently the transcript is polled in ms. Defaults to 3000.
  pollingInterval: 1000,
  // How long to wait in ms until the "Polling timeout" error is thrown. Defaults to infinite (-1).
  pollingTimeout: 5000,
});
Get sentences and paragraphs

const sentences = await client.transcripts.sentences(transcript.id);
const paragraphs = await client.transcripts.paragraphs(transcript.id);

Get subtitles

You can export the transcript as SRT or VTT captions, optionally limiting the characters per caption:

const charsPerCaption = 32;
let srt = await client.transcripts.subtitles(transcript.id, "srt");
srt = await client.transcripts.subtitles(transcript.id, "srt", charsPerCaption);

let vtt = await client.transcripts.subtitles(transcript.id, "vtt");
vtt = await client.transcripts.subtitles(transcript.id, "vtt", charsPerCaption);
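For instance, a minimal sketch saving the generated captions to disk (assuming Node.js fs):

import fs from "fs";

// Persist the caption files alongside your script
fs.writeFileSync("captions.srt", srt);
fs.writeFileSync("captions.vtt", vtt);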

List transcripts

This will return a page of transcripts you created.

const page = await client.transcripts.list();

You can also paginate over all pages.

let previousPageUrl: string | null = null;
do {
  const page = await client.transcripts.list(previousPageUrl);
  previousPageUrl = page.page_details.prev_url;
} while (previousPageUrl !== null);

[!NOTE] To paginate over all pages, you need to use page.page_details.prev_url because the transcripts are returned in descending order by creation date and time. The first page contains the most recent transcripts, and each "previous" page contains older transcripts.
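
As a sketch, the same loop can process each transcript while paginating (assuming the page object exposes a transcripts array, as the API's list response does):

let previousPageUrl: string | null = null;
do {
  const page = await client.transcripts.list(previousPageUrl);
  for (const t of page.transcripts) {
    console.log(t.id, t.status);
  }
  previousPageUrl = page.page_details.prev_url;
} while (previousPageUrl !== null);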

Delete a transcript

const res = await client.transcripts.delete(transcript.id);

Transcribe in real-time

Create the real-time transcriber.

const rt = client.realtime.transcriber();

You can also pass in the following options.

const rt = client.realtime.transcriber({
  realtimeUrl: "wss://localhost/override",
  // The API key passed to `AssemblyAI` will be used by default
  apiKey: process.env.ASSEMBLYAI_API_KEY,
  sampleRate: 16_000,
  wordBoost: ["foo", "bar"],
});

[!WARNING] Storing your API key in client-facing applications exposes your API key. Generate a temporary auth token on the server and pass it to your client. Server code:

const token = await client.realtime.createTemporaryToken({ expires_in: 60 });
// TODO: return token to client

Client code:

import { RealtimeTranscriber } from "assemblyai";
// TODO: implement getToken to retrieve token from server
const token = await getToken();
const rt = new RealtimeTranscriber({
  token,
});
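To make the token exchange concrete, here's a minimal sketch filling in both TODOs above. The Express server and the /token route are illustrative assumptions, not part of the SDK:

// server.js: hypothetical Express endpoint that mints short-lived tokens
import express from "express";
import { AssemblyAI } from "assemblyai";

const app = express();
const client = new AssemblyAI({ apiKey: process.env.ASSEMBLYAI_API_KEY });

app.get("/token", async (_req, res) => {
  const token = await client.realtime.createTemporaryToken({ expires_in: 60 });
  res.json({ token });
});

app.listen(3000);

// client-side: hypothetical getToken helper fetching from that endpoint
const getToken = async () => {
  const res = await fetch("/token");
  const { token } = await res.json();
  return token;
};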

You can configure the following events.

rt.on("open", ({ sessionId, expiresAt }) => console.log('Session ID:', sessionId, 'Expires at:', expiresAt));
rt.on("close", (code: number, reason: string) => console.log('Closed', code, reason));
rt.on("transcript", (transcript: TranscriptMessage) => console.log('Transcript:', transcript));
rt.on("transcript.partial", (transcript: PartialTranscriptMessage) => console.log('Partial transcript:', transcript));
rt.on("transcript.final", (transcript: FinalTranscriptMessage) => console.log('Final transcript:', transcript));
rt.on("error", (error: Error) => console.error('Error', error));

After configuring your events, connect to the server.

await rt.connect();

Send audio data via chunks.

// Pseudo code for getting audio
getAudio((chunk) => {
  rt.sendAudio(chunk);
});

Or send audio data via a stream by piping to the real-time stream.

audioStream.pipeTo(rt.stream());
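For example, a minimal sketch piping a raw PCM file into the transcriber (assumptions: the file matches the configured sample rate and encoding, and Node.js 17+ for Readable.toWeb):

import fs from "fs";
import { Readable } from "stream";

// Convert the Node stream to a web ReadableStream so pipeTo works
const audioStream = Readable.toWeb(fs.createReadStream("./audio.pcm"));
await audioStream.pipeTo(rt.stream());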

Close the connection when you're finished.

await rt.close();

Apply LLMs to your audio with LeMUR

Call LeMUR endpoints to apply LLMs to your transcript.

Prompt your audio with LeMUR

const { response } = await client.lemur.task({
  transcript_ids: ["0d295578-8c75-421a-885a-2c487f188927"],
  prompt: "Write a haiku about this conversation.",
});

Summarize with LeMUR

const { response } = await client.lemur.summary({
  transcript_ids: ["0d295578-8c75-421a-885a-2c487f188927"],
  answer_format: "one sentence",
  context: {
    speakers: ["Alex", "Bob"],
  },
});

Ask questions with LeMUR

const { response } = await client.lemur.questionAnswer({
  transcript_ids: ["0d295578-8c75-421a-885a-2c487f188927"],
  questions: [
    {
      question: "What are they discussing?",
      answer_format: "text",
    },
  ],
});

Generate action items with LeMUR

const { response } = await client.lemur.actionItems({
  transcript_ids: ["0d295578-8c75-421a-885a-2c487f188927"],
});

Delete a LeMUR request

You can delete the data of a previously submitted LeMUR request:

const response = await client.lemur.purgeRequestData(lemurResponse.request_id);

Contributing

If you want to contribute to the JavaScript SDK, follow the guidelines in CONTRIBUTING.md.