real-time-speech-analyzer v1.0.0
Real-Time Speech Analyzer
🎙️ Real-time speech analysis with local LLM using multiple concurrent analysis instructions
Analyze speech in real-time using Web Speech API and local Ollama models. Run multiple analysis instructions simultaneously on each speech segment with live processing feedback.
✨ Features
- Real-time speech transcription with live word-by-word display
- Multiple analysis instructions running concurrently on each speech chunk
- Conversation context management with rolling summaries
- Local LLM integration via Ollama (privacy-focused, no cloud required)
- Processing notifications with live feedback
- Customizable analysis prompts with persistent settings
- Cross-platform support (Windows, macOS, Linux)
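The concurrent-analysis feature above — several instructions applied to each speech chunk at once — can be sketched as a simple fan-out. This is an illustrative sketch, not the package's actual API; the LLM call is injected so a backend like Ollama can be swapped in or stubbed out:

```typescript
// Fan one speech chunk out to several analysis instructions concurrently.
// `Analyze` is injected so the LLM backend (e.g. Ollama) can be swapped or stubbed.
type Analyze = (instruction: string, chunk: string) => Promise<string>;

async function analyzeChunk(
  chunk: string,
  instructions: string[],
  analyze: Analyze,
): Promise<{ instruction: string; result: string }[]> {
  // One LLM call per instruction, all in flight at the same time.
  return Promise.all(
    instructions.map(async (instruction) => ({
      instruction,
      result: await analyze(instruction, chunk),
    })),
  );
}
```

Because the calls run concurrently, total latency per chunk is roughly that of the slowest instruction rather than the sum of all of them.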
🚀 Quick Start
Prerequisites
Install Ollama and pull a model:
```shell
# Install Ollama: https://ollama.ai/
ollama pull llama3.1:8b
ollama serve
```

Use Chrome or Edge (required for the Web Speech API).
Installation
```shell
# Install globally
npm install -g real-time-speech-analyzer

# Start the analyzer
speech-analyzer
```

Or run locally:

```shell
# Clone and run
git clone https://github.com/speech-analyzer/speech-analyzer
cd speech-analyzer
npm install
npm start
```

Usage
- Start the server - opens at http://localhost:3000
- Configure analysis instructions - defaults include:
  - Sentiment analysis (5 keywords)
  - Controversial statement detection with counterarguments
- Click "Start Listening" and begin speaking
- View real-time results in the AI Analysis column
🎯 Default Analysis Instructions
- "Analyse sentiment, show as 5 keywords, nothing else."
- "Identify most controversial statement and respond with a counterargument."
🔧 Configuration Options
Analysis Instructions
- Add multiple instructions with the + Add button
- Update All - Apply changes to all fields
- Default All - Reset to default instruction set
- Remove fields with ✖ button (when multiple exist)
Context Management
- Summary threshold - Create summary after N chunks (default: 20)
- Recent chunks - Keep N recent chunks in context (default: 10)
- Force Summary - Manually trigger summarization
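A minimal sketch of how these three settings could fit together, assuming a simple rolling-summary design (the class, method names, and the string-joining summarizer here are illustrative, not the package's actual API):

```typescript
// Rolling-summary context: keep the last N chunks verbatim and fold older
// material into a summary once `summaryThreshold` chunks have accumulated.
class ContextManager {
  private chunks: string[] = [];
  private summary = "";
  private sinceSummary = 0;

  constructor(
    private summaryThreshold = 20, // summarize after N chunks (default: 20)
    private recentChunks = 10,     // keep N recent chunks verbatim (default: 10)
    // Stand-in summarizer; the real one would be an LLM call.
    private summarize: (summary: string, chunks: string[]) => string = (s, c) =>
      [s, ...c].filter(Boolean).join(" / "),
  ) {}

  add(chunk: string): void {
    this.chunks.push(chunk);
    this.sinceSummary++;
    if (this.sinceSummary >= this.summaryThreshold) this.forceSummary();
    // Only the most recent chunks stay in the verbatim window.
    if (this.chunks.length > this.recentChunks) {
      this.chunks = this.chunks.slice(-this.recentChunks);
    }
  }

  // The "Force Summary" button maps to calling this directly.
  forceSummary(): void {
    this.summary = this.summarize(this.summary, this.chunks);
    this.sinceSummary = 0;
  }

  // Context sent with each analysis request: summary first, then recent chunks.
  context(): string {
    return [this.summary, ...this.chunks].filter(Boolean).join("\n");
  }
}
```

The point of the design is bounded memory: no matter how long the conversation runs, the prompt only ever carries one summary plus the recent-chunk window.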
System Prompt
- Customize the AI's behavior and response style
- Default includes 25-word response limit
- Update and Default buttons for easy management
🖥️ Cross-Platform Support
Windows
```shell
npm install -g real-time-speech-analyzer
speech-analyzer
```

macOS/Linux

```shell
npm install -g real-time-speech-analyzer
speech-analyzer
```

Runtime Requirements
- Node.js 18+ (runs on plain Node out of the box)
- Bun (optional, for faster performance)
- Ollama running locally
- Chrome/Edge browser
🏗️ Architecture
- Frontend: TypeScript + DOM manipulation
- Backend: Bun server with WebSocket communication
- LLM: Local Ollama integration
- Speech: Web Speech API with interim results
- Context: Rolling summary system for conversation memory
📊 Processing Flow
- Speech captured → Web Speech API
- Live transcription → Shows words as spoken
- Final chunks → Sent to analysis queue
- Processing notification → Yellow indicator with spinner
- Multiple LLM calls → One per analysis instruction
- Results displayed → Separate boxes per instruction
- Context updated → Rolling summary system
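Step 5 above issues one LLM call per instruction through Ollama's HTTP API. A sketch of building a single non-streaming request: the `/api/generate` endpoint and its `model`, `system`, `prompt`, and `stream` fields are Ollama's documented API, while the helper itself and the prompt layout are just an illustration:

```typescript
// Build one non-streaming request to Ollama's /api/generate endpoint.
interface OllamaRequest {
  url: string;
  body: {
    model: string;  // e.g. "llama3.1:8b"
    system: string; // the configurable system prompt
    prompt: string; // instruction + transcript chunk
    stream: boolean;
  };
}

function buildOllamaRequest(
  model: string,
  systemPrompt: string,
  instruction: string,
  chunk: string,
  host = "http://localhost:11434", // Ollama's default port
): OllamaRequest {
  return {
    url: `${host}/api/generate`,
    body: {
      model,
      system: systemPrompt,
      // The analysis instruction and the speech chunk form one prompt.
      prompt: `${instruction}\n\nTranscript:\n${chunk}`,
      stream: false, // wait for the full answer instead of streaming tokens
    },
  };
}

// Sending it is one fetch call (built into Node 18+ and Bun):
// const res = await fetch(req.url, { method: "POST", body: JSON.stringify(req.body) });
```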
🔒 Privacy
- 100% local processing - No cloud APIs
- Your speech never leaves your machine
- Local Ollama models for AI analysis
- No data collection or tracking
⚙️ Development
```shell
# Development mode with hot reload
npm run dev

# Build for production
npm run build

# Start production build
npm start
```

Project Structure

```
speech-analyzer/
├── server.ts     # Bun WebSocket server
├── app.ts        # Frontend TypeScript
├── index.html    # Main UI
├── bin/          # Executable scripts
└── dist/         # Built files
```

🤝 Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Test cross-platform compatibility
- Submit a pull request
📝 License
MIT License - see LICENSE file for details
🐛 Troubleshooting
"Speech recognition not supported"
- Use Chrome or Edge browser
- Ensure HTTPS (or localhost)
"Ollama: Unavailable"
- Start Ollama: `ollama serve`
- Pull a model: `ollama pull llama3.1:8b`
- Check that Ollama is running on port 11434
"No analysis appearing"
- Check browser console for errors
- Verify Ollama model is loaded
- Ensure analysis instructions are configured
Performance Issues
- Use lighter Ollama models (e.g. `llama3.1:8b` instead of `llama3.1:70b`)
- Reduce context window settings
- Close other browser tabs
🔗 Links
- GitHub: https://github.com/speech-analyzer/speech-analyzer
- npm: https://npmjs.com/package/real-time-speech-analyzer
- Ollama: https://ollama.ai/
- Bun: https://bun.sh/
Made with ❤️ for real-time speech analysis
