@micdrop/server
v2.2.0
Micdrop: Real-Time Voice Conversations with AI
Micdrop website | Documentation | Demo
Micdrop is a set of open-source TypeScript packages for building real-time voice conversations with AI agents. It handles the complexities on both the browser and server side (microphone, speaker, VAD, network communication, etc.) and provides ready-to-use implementations for various AI providers.
@micdrop/server
The Node.js server implementation of Micdrop.
For the browser implementation, see @micdrop/client.
Features
- AI agents integration
- Speech-to-text and text-to-speech integration
- Advanced audio processing:
  - Streaming input and output
  - Audio conversion
  - Interruptions handling
- Conversation state management
- WebSocket-based audio streaming
Installation
npm install @micdrop/server

Quick Start
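The server example in this section reads four values from `process.env`. In TypeScript those values are typed `string | undefined`, so it can help to check them once at startup and fail with a clear message. The following is a minimal sketch under that assumption; `requireEnv`, `Env`, and `REQUIRED` are hypothetical names for this illustration and are not part of any Micdrop package.

```typescript
// Hypothetical helper, not part of any Micdrop package.
// `env` is passed in explicitly (for example process.env) so the function is easy to test.
type Env = Record<string, string | undefined>

function requireEnv<K extends string>(env: Env, names: readonly K[]): Record<K, string> {
  // Collect every missing or empty variable so the error lists them all at once.
  const missing = names.filter((name) => !env[name])
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
  const values = {} as Record<K, string>
  for (const name of names) {
    values[name] = env[name] as string
  }
  return values
}

// The four variable names used by the server example in this section.
const REQUIRED = [
  'OPENAI_API_KEY',
  'GLADIA_API_KEY',
  'ELEVENLABS_API_KEY',
  'ELEVENLABS_VOICE_ID',
] as const
```

With such a helper, `const keys = requireEnv(process.env, REQUIRED)` could run before the WebSocket server is created, and `keys.OPENAI_API_KEY` could be passed wherever the example uses `process.env.OPENAI_API_KEY`.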
import { MicdropServer } from '@micdrop/server'
import { OpenaiAgent } from '@micdrop/openai'
import { GladiaSTT } from '@micdrop/gladia'
import { ElevenLabsTTS } from '@micdrop/elevenlabs'
import { WebSocketServer } from 'ws'

const wss = new WebSocketServer({ port: 8081 })

wss.on('connection', (socket) => {
  // Handle voice conversation
  new MicdropServer(socket, {
    firstMessage: 'How can I help you today?',
    agent: new OpenaiAgent({
      apiKey: process.env.OPENAI_API_KEY,
      systemPrompt: 'You are a helpful assistant',
    }),
    stt: new GladiaSTT({
      apiKey: process.env.GLADIA_API_KEY,
    }),
    tts: new ElevenLabsTTS({
      apiKey: process.env.ELEVENLABS_API_KEY,
      voiceId: process.env.ELEVENLABS_VOICE_ID,
    }),
  })
})

Agent / STT / TTS
Micdrop server has 3 main components:
- Agent - AI agent using LLM
- STT - Speech-to-text
- TTS - Text-to-speech
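To make the division of labor between the three components concrete, here is a minimal sketch of a single conversation turn passing through them. Everything in it is an assumption made for illustration: `SketchSTT`, `SketchAgent`, `SketchTTS`, `Message`, `Components`, `runTurn`, and `toyParts` are hypothetical names, not the @micdrop/server API. The real server streams input and output, whereas this sketch processes one whole buffer at a time to stay short.

```typescript
// Hypothetical shapes for illustration only. They are NOT the @micdrop/server
// classes; the real server streams audio instead of handling whole buffers.
type Message = { role: 'user' | 'assistant'; content: string }

interface SketchSTT {
  transcribe(audio: Uint8Array): string
}
interface SketchAgent {
  answer(history: Message[]): string
}
interface SketchTTS {
  speak(text: string): Uint8Array
}
interface Components {
  stt: SketchSTT
  agent: SketchAgent
  tts: SketchTTS
}

// One conversation turn: user audio in, assistant audio out, history extended.
function runTurn(audio: Uint8Array, history: Message[], parts: Components) {
  const userText = parts.stt.transcribe(audio)
  const withUser: Message[] = [...history, { role: 'user', content: userText }]
  const reply = parts.agent.answer(withUser)
  const nextHistory: Message[] = [...withUser, { role: 'assistant', content: reply }]
  return { history: nextHistory, audioOut: parts.tts.speak(reply) }
}

// Toy stand-ins: each byte is treated as one character code.
const toyParts: Components = {
  stt: {
    transcribe: (audio) => {
      let text = ''
      for (let i = 0; i < audio.length; i++) text += String.fromCharCode(audio[i])
      return text
    },
  },
  agent: {
    // Echo agent: repeats the last message in the history.
    answer: (history) => `You said: ${history[history.length - 1].content}`,
  },
  tts: {
    speak: (text) => {
      const bytes = new Uint8Array(text.length)
      for (let i = 0; i < text.length; i++) bytes[i] = text.charCodeAt(i)
      return bytes
    },
  },
}
```

Calling `runTurn(new Uint8Array([104, 105]), [], toyParts)` feeds the bytes for "hi" through the toy STT, the echo agent, and the toy TTS, and returns the two-message history together with the reply encoded as bytes. Swapping any one entry of `toyParts` leaves the other two untouched, which is the point of keeping the three roles separate.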
Available implementations
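Micdrop provides ready-to-use implementations for several AI providers. The Quick Start above uses three of them: OpenAI (@micdrop/openai) as the Agent, Gladia (@micdrop/gladia) for STT, and ElevenLabs (@micdrop/elevenlabs) for TTS.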
Custom implementations
You can use the provided abstractions to write your own implementations:
- Agent - Abstract class for answer generation
- STT - Abstract class for speech-to-text
- TTS - Abstract class for text-to-speech
Documentation
Read the full documentation of the Micdrop server on the website.
License
MIT
Author
Originally developed for Raconte.ai and open sourced by Lonestone (GitHub)
