icode-stt-google

v1.1.0

Published

2 months ago

Icode implementation of speech-to-text using Google Cloud


v4nz777

icode-stt-google — Setup Guide

Prerequisites

Node.js 14+
A Google Cloud account (free tier works — includes 60 minutes/month of STT)
npm install completed in this directory

Step 1: Create a GCP Project

1. Go to Google Cloud Console
2. Click the project dropdown at the top bar → New Project
3. Enter a project name (e.g. icode-stt) and click Create
4. Note your Project ID (shown below the name field; it may differ from the name)

If you already have a project, skip this step.
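Because the Project ID can differ from the display name, it is easy to paste the wrong value into the constructor later. The following is a minimal sketch of a format check, assuming the standard rules for user-chosen project IDs (6 to 30 characters, lowercase letters, digits and hyphens, starting with a letter and not ending with a hyphen). The helper name looksLikeProjectId is hypothetical and not part of icode-stt-google, and the check says nothing about whether the project actually exists.

```javascript
// Format rules for user-chosen Google Cloud project IDs:
// 6-30 characters, lowercase letters, digits and hyphens,
// starting with a letter and not ending with a hyphen.
const PROJECT_ID_PATTERN = /^[a-z][a-z0-9-]{4,28}[a-z0-9]$/;

// Hypothetical helper, not part of icode-stt-google.
function looksLikeProjectId(value) {
  return typeof value === 'string' && PROJECT_ID_PATTERN.test(value);
}

console.log(looksLikeProjectId('icode-stt-123456')); // true
console.log(looksLikeProjectId('Icode STT'));        // false: a display name, not an ID
```

Older domain-scoped project IDs (containing a colon) would fail this check, so treat a false result as a prompt to look again rather than as proof of a mistake.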

Step 2: Enable the Speech-to-Text API

1. In the Cloud Console, go to APIs & Services → Library
   - Direct link: https://console.cloud.google.com/apis/library
2. Search for "Cloud Speech-to-Text API"
3. Click on it and press Enable

Make sure you're enabling the API in the correct project (check the project selector at the top).

Step 3: Create a Service Account + Key File

1. Go to IAM & Admin → Service Accounts
   - Direct link: https://console.cloud.google.com/iam-admin/serviceaccounts
2. Click + Create Service Account
   - Name: stt-client (or any name)
   - Click Create and Continue
3. Grant the role: Cloud Speech Client (roles/speech.client)
   - If you can't find it, use Cloud Speech Editor, or Owner for short-lived testing only (Owner grants full access to the project)
   - Click Continue → Done
4. Click on the newly created service account
5. Go to the Keys tab → Add Key → Create New Key
6. Choose JSON → Create
7. A .json file will download. Move it to this project directory:

google/
  service-account.json   ← place it here
  index.js
  lib/
  ...

Important: service-account.json contains a private key and must never be committed. Confirm that .gitignore covers it (see Step 4), then run git status before any commit to double-check that the file is not listed.
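Before wiring the key into code, it can help to confirm that the downloaded file really is a service account key and not some other JSON file. The sketch below checks a few fields that Google Cloud service account key files contain (type, project_id, private_key, client_email). The helper names are hypothetical and not part of icode-stt-google, and the check does not prove the key is valid or still active.

```javascript
const fs = require('fs');

// Fields that a Google Cloud service account key file is expected to contain.
const REQUIRED_FIELDS = ['type', 'project_id', 'private_key', 'client_email'];

// Hypothetical helper, not part of icode-stt-google.
// Takes the parsed JSON and returns a list of problems (empty when it looks fine).
function checkServiceAccountKey(key) {
  if (key === null || typeof key !== 'object') {
    return ['key file does not contain a JSON object'];
  }
  const problems = [];
  for (const field of REQUIRED_FIELDS) {
    if (typeof key[field] !== 'string' || key[field].length === 0) {
      problems.push(`missing field: ${field}`);
    }
  }
  if (typeof key.type === 'string' && key.type !== '' && key.type !== 'service_account') {
    problems.push(`unexpected type: ${key.type}`);
  }
  return problems;
}

// Reads a key file from disk and runs the same checks on it.
function checkServiceAccountKeyFile(filePath) {
  return checkServiceAccountKey(JSON.parse(fs.readFileSync(filePath, 'utf8')));
}

// Example (run from the project directory):
//   console.log(checkServiceAccountKeyFile('./service-account.json'));
```

An empty array means the file has the expected shape. Avoid logging the parsed key itself, since it includes the private key.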

Step 4: Add service account key to .gitignore

Add these lines to .gitignore if not already present. They ignore every .json file except the two npm manifests:

*.json
!package.json
!package-lock.json

Or, to ignore only the key file and leave other .json files tracked:

service-account.json

Step 5: Get a Sample Audio File

For testing, you need an audio file. Options:

Option A — Record one yourself:

Use any voice recorder app, save as .wav or .flac
Keep it under 60 seconds for recognize() testing

Option B — Download a sample:

Google provides public samples in GCS:

gs://cloud-samples-data/speech/brooklyn_bridge.flac

To download locally:

curl -o sample.flac https://storage.googleapis.com/cloud-samples-data/speech/brooklyn_bridge.flac
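A failed download can leave an HTML or XML error page saved under the name sample.flac, which would later surface as an audio error. As a rough sanity check, the sketch below looks at the first bytes of the file: FLAC files begin with the ASCII marker fLaC, and WAV files begin with RIFF followed by WAVE at byte offset 8. The helper names are hypothetical and not part of icode-stt-google, and only these two containers are recognized.

```javascript
const fs = require('fs');

// Hypothetical helper, not part of icode-stt-google.
// Identifies a container from the leading bytes of a file.
function sniffAudioFormat(header) {
  if (header.length >= 4 && header.toString('latin1', 0, 4) === 'fLaC') {
    return 'flac';
  }
  if (
    header.length >= 12 &&
    header.toString('latin1', 0, 4) === 'RIFF' &&
    header.toString('latin1', 8, 12) === 'WAVE'
  ) {
    return 'wav';
  }
  return 'unknown';
}

// Reads only the first 12 bytes of the file and classifies them.
function sniffAudioFile(filePath) {
  const fd = fs.openSync(filePath, 'r');
  try {
    const header = Buffer.alloc(12);
    const bytesRead = fs.readSync(fd, header, 0, 12, 0);
    return sniffAudioFormat(header.subarray(0, bytesRead));
  } finally {
    fs.closeSync(fd);
  }
}

// Example: console.log(sniffAudioFile('./sample.flac'));
```

A result of unknown does not necessarily mean the file is unusable (other formats such as .mp3 are not covered here), but for the sample above anything other than flac suggests the download went wrong.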

Step 6: Test It

Create a file called test-run.js:

const { GoogleSTT } = require('.');

async function main() {
  const stt = new GoogleSTT({
    projectId: 'YOUR_PROJECT_ID',       // from Step 1
    credentials: './service-account.json', // from Step 3
  });

  // Test recognize with a local file
  const result = await stt.recognize('./sample.flac');
  console.log('Transcript:', result.transcript);
  console.log('Results:', result.results);

  await stt.close();
}

main().catch(console.error);

Run it:

node test-run.js

Expected output (the confidence value will vary):

Transcript: how old is the Brooklyn Bridge
Results: [ { transcript: 'how old is the Brooklyn Bridge', confidence: 0.98 } ]

Troubleshooting

| Error | Cause | Fix |
|-------|-------|-----|
| STTError: VALIDATION projectId is required | Missing projectId in constructor | Pass projectId: 'your-project-id' |
| STTError: AUTH_FAILED (code 7 or 16) | Bad credentials or API not enabled | Check service account key path and that the API is enabled |
| STTError: QUOTA_EXCEEDED (code 8) | Exceeded free tier or billing not set up | Enable billing on the project or wait for quota reset |
| STTError: INVALID_AUDIO (code 3) | Corrupt or unsupported audio format | Try a different file; .wav, .flac, .mp3 all work with autoDecodingConfig |
| Could not load the default credentials | No credentials found | Set credentials option or GOOGLE_APPLICATION_CREDENTIALS env var |
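The numeric codes in the table are standard gRPC status codes (3 INVALID_ARGUMENT, 7 PERMISSION_DENIED, 8 RESOURCE_EXHAUSTED, 16 UNAUTHENTICATED). A small lookup like the one below can turn a code into the fix from the table when logging failures. This is a sketch only: the helper names are hypothetical, and it assumes the caught error exposes the numeric status as err.code, which this guide does not confirm for STTError.

```javascript
// gRPC status codes named in the troubleshooting table, paired with the table's fix.
const HINTS = new Map([
  [3, 'INVALID_ARGUMENT: try a different audio file or format'],
  [7, 'PERMISSION_DENIED: check the service account key path and that the API is enabled'],
  [8, 'RESOURCE_EXHAUSTED: enable billing on the project or wait for quota reset'],
  [16, 'UNAUTHENTICATED: check the service account key path and that the API is enabled'],
]);

const NO_HINT = 'No hint for this code; inspect the full error';

// Hypothetical helper, not part of icode-stt-google.
function hintForGrpcCode(code) {
  return HINTS.get(code) || NO_HINT;
}

// Assumption: the caught error exposes the numeric gRPC status as `err.code`.
function describeError(err) {
  const code = err && typeof err.code === 'number' ? err.code : undefined;
  const message = err && err.message ? err.message : String(err);
  return `${message} (${hintForGrpcCode(code)})`;
}

// Example: main().catch((err) => console.error(describeError(err)));
```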

Alternative: Use Environment Variable Instead of credentials Option

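Instead of passing the key file path in code, you can set the GOOGLE_APPLICATION_CREDENTIALS environment variable. A relative path is resolved against the directory you run node from, so an absolute path is safer if you launch the script from elsewhere: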

export GOOGLE_APPLICATION_CREDENTIALS="./service-account.json"

Then omit the credentials option:

const stt = new GoogleSTT({ projectId: 'YOUR_PROJECT_ID' });
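When both approaches are in play, it can be useful to log which key file a run would pick up. The sketch below is one way to do that, under the assumption that an explicit credentials option takes precedence over the environment variable; this guide does not state how icode-stt-google orders the two. The helper name resolveCredentialsPath is hypothetical.

```javascript
const path = require('path');

// Hypothetical helper, not part of icode-stt-google.
// Assumed precedence: an explicit option first, then the
// GOOGLE_APPLICATION_CREDENTIALS variable, otherwise undefined.
function resolveCredentialsPath(option, env = process.env, cwd = process.cwd()) {
  const raw = option || env.GOOGLE_APPLICATION_CREDENTIALS;
  if (!raw) {
    return undefined;
  }
  // Relative paths are resolved against the working directory.
  return path.resolve(cwd, raw);
}

console.log('Key file in use:', resolveCredentialsPath() || '(none configured)');
```

Printing the absolute path makes it obvious when a relative value points somewhere unexpected because the script was started from a different directory.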
