
mbz-voice-sdk

v1.0.21

Published

πŸŽ™οΈ MBZ Voice SDK: Easily add voice recognition, Gemini-based AI replies, and TTS to any web app.

Readme

πŸŽ™οΈ MBZ Voice SDK

Speak. Think. Respond. Seamlessly.

MBZ-Voice-SDK is a powerful developer tool that enables you to integrate voice input, AI understanding (via Gemini), and spoken responses into any modern web app. Whether you're building a chatbot, AI assistant, or a voice-powered UI β€” this SDK makes it plug-and-play.


🔥 Features

✅ Voice Input: Capture user speech via browser microphone using the Web Speech API
✅ AI Processing: Gemini-powered AI backend built with FastAPI
✅ Voice Response: Convert AI text responses to spoken words using Web Speech TTS
✅ Audio Controls: Easily toggle mute/unmute functionality
✅ Conversation Memory: Store the last 3 Q&A exchanges using localStorage
✅ Framework Agnostic: Seamlessly integrate with plain JavaScript, React, Vue, or any modern frontend framework
✅ Customizable: Configure language, voice type, and response behavior
✅ Lightweight: Minimal dependencies for optimal performance
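
The conversation-memory feature above (keep only the last 3 Q&A exchanges in localStorage) can be sketched roughly like this. This is a minimal illustration of the idea, not the SDK's actual implementation; the `ConversationMemory` class and its method names are invented for this example:

```javascript
// Sketch of a "last N exchanges" memory backed by a localStorage-like store.
// `storage` is any object with getItem/setItem (use window.localStorage in the browser).
class ConversationMemory {
  constructor(storage, maxHistory = 3, key = "mbz-history") {
    this.storage = storage;
    this.maxHistory = maxHistory;
    this.key = key;
  }

  load() {
    const raw = this.storage.getItem(this.key);
    return raw ? JSON.parse(raw) : [];
  }

  add(question, answer) {
    const history = this.load();
    history.push({ question, answer });
    // Drop the oldest exchanges once the cap is exceeded.
    while (history.length > this.maxHistory) history.shift();
    this.storage.setItem(this.key, JSON.stringify(history));
  }
}

// Outside the browser, a small Map wrapper can stand in for localStorage.
const fakeStorage = {
  data: new Map(),
  getItem(k) { return this.data.has(k) ? this.data.get(k) : null; },
  setItem(k, v) { this.data.set(k, String(v)); },
};

const memory = new ConversationMemory(fakeStorage, 3);
["a", "b", "c", "d"].forEach((q) => memory.add(q, q.toUpperCase()));
// Only the last three exchanges survive: b, c, d.
```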

💻 Requirements

  • Modern web browser with support for:
    • Web Speech API (SpeechRecognition)
    • Web Speech API (SpeechSynthesis)
    • localStorage
  • Node.js 14+ (for development)
  • Python 3.8+ (for backend)
  • Gemini API key from Google AI Studio

📦 Install the SDK

NPM Installation

npm install mbz-voice-sdk

or run the package's init command:

npx mbz-voice-sdk init

Yarn Installation

yarn add mbz-voice-sdk

Local Installation (if cloned)

cd mbz-voice-sdk/sdk
npm install

CDN Usage

<script src="https://unpkg.com/mbz-voice-sdk@latest/dist/mbz-voice-sdk.min.js"></script>

βš™οΈ Backend Setup Guide

This SDK requires a backend API endpoint connected to Gemini (Google AI). We've provided a ready-to-use FastAPI backend in the /backend folder.

1️⃣ Navigate to the backend directory

cd ../backend

2️⃣ Install Python dependencies

pip install -r requirements.txt

3️⃣ Add Your Gemini API Key

Create a .env file in the backend folder and paste your Gemini API key:

GEMINI_API_KEY=your_google_gemini_api_key_here

👉 Get your key from: https://makersuite.google.com/app/apikey

4️⃣ Run the server

uvicorn main:app --reload

Now your backend is live at:

http://localhost:8000/ask
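
The SDK normally talks to this endpoint for you, but you can also call it directly. The request and response field names below (`{ question }` in, `{ reply }` out) are assumptions about the bundled backend; check backend/main.py for the actual shape:

```javascript
// Build a POST request for the /ask endpoint. The { question } body and
// { reply } response field are assumptions about the bundled FastAPI backend.
function buildAskRequest(apiUrl, question) {
  return {
    url: apiUrl,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ question }),
    },
  };
}

async function ask(apiUrl, question) {
  const { url, options } = buildAskRequest(apiUrl, question);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Backend error: ${res.status}`);
  const data = await res.json();
  return data.reply; // assumed response field
}

// Usage (with the backend running): ask("http://localhost:8000/ask", "Hello!");
```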

🧠 SDK Usage Example

Basic Usage

import { MBZVoiceAgent } from "mbz-voice-sdk";

const agent = new MBZVoiceAgent({
  apiUrl: "http://localhost:8000/ask",
  lang: "en-US",
  speak: true
});

agent.onTranscript((text) => {
  console.log("User said:", text);
});

agent.onResponse((reply) => {
  console.log("AI replied:", reply);
});

document.getElementById("start-btn").onclick = () => agent.listen();

React Integration

import React, { useEffect, useState } from 'react';
import { MBZVoiceAgent } from 'mbz-voice-sdk';

function VoiceAssistant() {
  const [transcript, setTranscript] = useState('');
  const [response, setResponse] = useState('');
  const [isListening, setIsListening] = useState(false);
  const [agent, setAgent] = useState(null);

  useEffect(() => {
    // Initialize the agent
    const voiceAgent = new MBZVoiceAgent({
      apiUrl: "http://localhost:8000/ask",
      lang: "en-US",
      speak: true
    });

    // Set up event handlers
    voiceAgent.onTranscript((text) => {
      setTranscript(text);
    });

    voiceAgent.onResponse((reply) => {
      setResponse(reply);
    });

    voiceAgent.onListeningChange((listening) => {
      setIsListening(listening);
    });

    setAgent(voiceAgent);

    // Cleanup on unmount
    return () => {
      voiceAgent.cleanup();
    };
  }, []);

  const handleListen = () => {
    if (agent) {
      agent.listen();
    }
  };

  return (
    <div className="voice-assistant">
      <button 
        onClick={handleListen}
        className={isListening ? 'listening' : ''}
      >
        {isListening ? '🔴 Listening...' : '🎙️ Start Talking'}
      </button>
      
      {transcript && (
        <div className="transcript">
          <h3>You said:</h3>
          <p>{transcript}</p>
        </div>
      )}
      
      {response && (
        <div className="response">
          <h3>AI response:</h3>
          <p>{response}</p>
        </div>
      )}
    </div>
  );
}

export default VoiceAssistant;

🧪 HTML Quick Test

<button id="start-btn">🎙️ Start Talking</button>
<div id="transcript"></div>
<div id="response"></div>

<script type="module">
  import { MBZVoiceAgent } from 'mbz-voice-sdk';

  const agent = new MBZVoiceAgent({ 
    apiUrl: 'http://localhost:8000/ask',
    speak: true
  });

  const transcriptEl = document.getElementById('transcript');
  const responseEl = document.getElementById('response');

  agent.onTranscript(text => {
    console.log("🎀", text);
    transcriptEl.textContent = `You said: ${text}`;
  });
  
  agent.onResponse(reply => {
    console.log("🤖", reply);
    responseEl.textContent = `AI says: ${reply}`;
  });

  document.getElementById("start-btn").onclick = () => agent.listen();
</script>

📚 API Documentation

MBZVoiceAgent Class

The main class for interacting with the SDK.

Constructor

const agent = new MBZVoiceAgent(options);

Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| apiUrl | String | Required | The URL of your backend API endpoint |
| lang | String | 'en-US' | The language for speech recognition |
| speak | Boolean | true | Whether to speak the AI's response |
| voiceIndex | Number | 0 | Index of the voice to use for speech synthesis |
| pitch | Number | 1.0 | The pitch of the voice (0.1 to 2.0) |
| rate | Number | 1.0 | The speed of the voice (0.1 to 10.0) |
| volume | Number | 1.0 | The volume of the voice (0.0 to 1.0) |
| maxHistory | Number | 3 | Maximum number of Q&A pairs to store in history |
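
The table does not say whether out-of-range pitch, rate, or volume values are clamped by the SDK, so a defensive caller can clamp them before constructing the agent. `clampOptions` is illustrative, not part of the SDK; the ranges come from the table above:

```javascript
// Clamp voice options to the documented ranges before passing them
// to MBZVoiceAgent; missing values fall back to the documented defaults.
function clampOptions(opts) {
  const clamp = (v, lo, hi, dflt) =>
    typeof v === "number" ? Math.min(hi, Math.max(lo, v)) : dflt;
  return {
    ...opts,
    pitch: clamp(opts.pitch, 0.1, 2.0, 1.0),
    rate: clamp(opts.rate, 0.1, 10.0, 1.0),
    volume: clamp(opts.volume, 0.0, 1.0, 1.0),
  };
}

// Example: const agent = new MBZVoiceAgent(clampOptions({ apiUrl, pitch: 5 }));
```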

Methods

| Method | Parameters | Description |
|--------|------------|-------------|
| listen() | None | Start listening for voice input |
| stop() | None | Stop listening for voice input |
| mute() | None | Mute the voice response |
| unmute() | None | Unmute the voice response |
| cleanup() | None | Clean up resources and event listeners |
| onTranscript(callback) | Function | Set callback for transcript events |
| onResponse(callback) | Function | Set callback for AI response events |
| onListeningChange(callback) | Function | Set callback for listening state changes |
| onError(callback) | Function | Set callback for error events |
| getHistory() | None | Get the conversation history |
| clearHistory() | None | Clear the conversation history |
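
The mute()/unmute() pair lends itself to a small toggle for a mute button. A sketch using a mock agent (the `makeMuteToggle` helper is not part of the SDK, and it tracks muted state itself since the SDK does not document a state accessor):

```javascript
// Build a toggle on top of the documented mute()/unmute() methods.
function makeMuteToggle(agent) {
  let muted = false;
  return function toggle() {
    if (muted) agent.unmute();
    else agent.mute();
    muted = !muted;
    return muted; // the new state
  };
}

// A mock agent lets us demonstrate the toggle outside the browser.
const calls = [];
const mockAgent = {
  mute: () => calls.push("mute"),
  unmute: () => calls.push("unmute"),
};
const toggleMute = makeMuteToggle(mockAgent);
```

In a real app, wire `toggleMute` to a button's click handler after constructing `MBZVoiceAgent`.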

🔧 Troubleshooting

Microphone Not Working

  • Ensure your browser has permission to access the microphone
  • Check if your microphone is properly connected and working
  • Try using a different browser (Chrome and Edge have the best support)

Speech Recognition Not Starting

  • Make sure you're using a supported browser (Chrome, Edge, Safari)
  • Check your internet connection
  • Verify that your site is served over HTTPS (required for production)

Backend Connection Issues

  • Confirm your backend server is running
  • Check for CORS issues (the backend should allow requests from your frontend)
  • Verify your API URL is correct in the SDK initialization

Voice Response Not Working

  • Check if your device's volume is turned on
  • Make sure the speak option is set to true
  • Try using a different voice by changing the voiceIndex

🤝 Contributing

Contributions are welcome! Here's how you can help:

  1. Fork the repository
  2. Create a feature branch:
     git checkout -b feature/amazing-feature
  3. Commit your changes:
     git commit -m 'Add some amazing feature'
  4. Push to the branch:
     git push origin feature/amazing-feature
  5. Open a Pull Request

Development Setup

# Clone the repository
git clone https://github.com/ProMBZ/mbz-voice-sdk.git

# Install dependencies
cd mbz-voice-sdk
npm install

# Run development server
npm run dev

# Build for production
npm run build

πŸ” Security Notice

This SDK does not use any built-in Gemini key.

πŸ” You are responsible for adding your own Gemini key to the backend.

Never include your Gemini key in frontend code.

🧰 Tools Used

  • Frontend:
    • JavaScript (SpeechRecognition + TTS APIs)
    • localStorage for conversation persistence
    • Rollup for bundling
  • Backend:
    • FastAPI (Python)
    • Google Generative AI SDK (Gemini 1.5 Flash)
    • python-dotenv for environment variables

📄 License

MIT © 2025, developed by Muhammad (MBZ-Voice-SDK)

🔗 GitHub: @ProMBZ

💬 Support

If you have questions, suggestions, or want to collaborate:

📧 Email: [email protected]
🌍 Portfolio: https://kzml8bqhnxp4cn0duf08.lite.vusercontent.net/


Made with ❀️ by Muhammad

