qlabs-voice-ai-platform
v1.0.0
qlabs-style Voice AI Platform API
Voice AI Platform API
A comprehensive Node.js API for an ElevenLabs-style Voice AI Platform with Text-to-Speech (TTS), Speech-to-Text (STT), voice cloning, workspace management, and billing functionality.
🚀 Features
- Authentication & User Management - JWT-based auth with role-based access control
- Multi-tenant Workspaces - Team collaboration with role-based permissions
- Voice Library & Cloning - Custom voice creation and management
- Text-to-Speech (TTS) - Generate speech from text using custom voices
- Speech-to-Text (STT) - Transcribe audio to text with batch processing
- Developer API - API key management for external integrations
- Analytics & Usage Tracking - Detailed usage analytics and reporting
- Billing & Plans - Subscription management with usage limits
- Real-time Notifications - User notification system
📁 Project Structure
voice-ai-platform/
├── config/
│ └── database.js # PostgreSQL connection config
├── controllers/
│ ├── authController.js # Authentication logic
│ ├── userMetadataController.js
│ ├── workspaceController.js
│ ├── voiceController.js
│ ├── ttsController.js
│ ├── sttController.js
│ ├── developerApiController.js
│ ├── analyticsController.js
│ └── billingController.js
├── middleware/
│ ├── auth.js # Authentication middleware
│ └── validation.js # Request validation
├── routes/
│ └── index.js # API route definitions
├── .env # Environment variables
├── package.json # Dependencies
├── server.js # Main application entry point
└── README.md # This file
🛠️ Setup Instructions
Prerequisites
- Node.js 16+
- PostgreSQL 14+
- npm or yarn
1. Install Dependencies
npm install
2. Database Setup
- Create a PostgreSQL database:
CREATE DATABASE voice_ai_platform;
- Run the SQL schema from the tts.sql file to create all tables and seed data.
3. Environment Configuration
Copy and configure your .env file:
# Database Configuration
DB_HOST=localhost
DB_PORT=5432
DB_NAME=voice_ai_platform
DB_USER=your_db_user
DB_PASSWORD=your_db_password
# JWT Configuration
JWT_SECRET=your_super_secret_jwt_key_here_change_in_production
JWT_EXPIRES_IN=7d
# Server Configuration
PORT=3000
NODE_ENV=development
# CORS Configuration
CORS_ORIGIN=http://localhost:3000
# Rate Limiting
RATE_LIMIT_WINDOW_MS=900000
RATE_LIMIT_MAX_REQUESTS=100
# File Upload
MAX_FILE_SIZE=50mb
UPLOAD_PATH=./uploads
# External Services (Optional)
STRIPE_SECRET_KEY=sk_test_your_stripe_key
STRIPE_WEBHOOK_SECRET=whsec_your_webhook_secret
4. Start the Server
Development mode:
npm run dev
Production mode:
npm start
The API will be available at http://localhost:3000
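Values read from .env arrive as strings, so numeric settings such as PORT and the rate-limit window need parsing before use. The following is a minimal sketch of how the variables from step 3 might be collected into one object. The loadConfig helper and the shape of the returned object are assumptions for illustration, not part of the project; the fallback defaults mirror the sample values above.

```javascript
// Hypothetical helper: gathers the settings from step 3 out of an env object.
// Variable names match the sample .env; defaults mirror its sample values.
function loadConfig(env = process.env) {
  // Parse an integer setting, falling back when it is missing or malformed.
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value, 10);
    return Number.isNaN(parsed) ? fallback : parsed;
  };

  const config = {
    db: {
      host: env.DB_HOST || 'localhost',
      port: toInt(env.DB_PORT, 5432),
      database: env.DB_NAME || 'voice_ai_platform',
      user: env.DB_USER,
      password: env.DB_PASSWORD,
    },
    jwt: {
      secret: env.JWT_SECRET,
      expiresIn: env.JWT_EXPIRES_IN || '7d',
    },
    port: toInt(env.PORT, 3000),
    nodeEnv: env.NODE_ENV || 'development',
    rateLimit: {
      windowMs: toInt(env.RATE_LIMIT_WINDOW_MS, 900000),
      maxRequests: toInt(env.RATE_LIMIT_MAX_REQUESTS, 100),
    },
  };

  // Fail fast instead of starting without a signing secret.
  if (!config.jwt.secret) {
    throw new Error('JWT_SECRET must be set');
  }
  return config;
}
```

With the sample values, RATE_LIMIT_WINDOW_MS=900000 is a 15-minute window, so the default limit works out to 100 requests per 15 minutes.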
📚 API Documentation
- Health Check: http://localhost:3000/health - Basic health status
- Service Status: http://localhost:3000/status - Detailed service status
Interactive Documentation
- Swagger UI: http://localhost:3000/api-docs - Complete interactive API documentation
- OpenAPI Spec: /docs/swagger.yaml - OpenAPI 3.0 specification
- Postman Collection: /docs/postman-collection.json - Ready-to-use Postman collection
- Testing Guide: /docs/API_TESTING_GUIDE.md - Comprehensive testing examples
Quick API Overview
All endpoints except the public ones (for example GET /plans) require JWT authentication via the Authorization: Bearer <token> header.
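To make the header format concrete, here is a minimal sketch of extracting and checking a bearer token using only Node's built-in crypto module. It assumes tokens are signed with HS256 using JWT_SECRET; the project's actual middleware (middleware/auth.js) is not shown in this README and may well use a JWT library instead. The function names are hypothetical.

```javascript
const crypto = require('crypto');

// JWT segments are base64url-encoded.
const toB64Url = (value) => Buffer.from(value).toString('base64url');

// Raw HMAC-SHA256 of the signing input.
function hmac(data, secret) {
  return crypto.createHmac('sha256', secret).update(data).digest();
}

// Build an HS256 token (included only so the sketch can be exercised).
function signToken(payload, secret) {
  const head = toB64Url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = toB64Url(JSON.stringify(payload));
  const signature = hmac(`${head}.${body}`, secret).toString('base64url');
  return `${head}.${body}.${signature}`;
}

// Pull the token out of an "Authorization: Bearer <token>" header value.
function extractBearerToken(headerValue) {
  const match = /^Bearer\s+(\S+)$/i.exec(headerValue || '');
  return match ? match[1] : null;
}

// Return the payload if the signature and expiry check out, otherwise null.
function verifyToken(token, secret) {
  const parts = String(token).split('.');
  if (parts.length !== 3) return null;
  const [head, body, signature] = parts;

  const expected = hmac(`${head}.${body}`, secret);
  const given = Buffer.from(signature, 'base64url');
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) {
    return null;
  }

  try {
    const header = JSON.parse(Buffer.from(head, 'base64url').toString('utf8'));
    if (header.alg !== 'HS256') return null;
    const payload = JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
    // "exp" is expressed in seconds since the epoch.
    if (typeof payload.exp === 'number' && payload.exp * 1000 <= Date.now()) {
      return null;
    }
    return payload;
  } catch {
    return null;
  }
}
```

A middleware built on this would call extractBearerToken on the incoming header, pass the result to verifyToken, and respond with 401 when either returns null.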
API Groups
🔐 Auth & Users
- POST /auth/signup - Register new user with workspace
- POST /auth/login - Authenticate user
- GET /auth/me - Get current user info
🧠 User Metadata
- POST /user/metadata - Store user metadata (onboarding, preferences)
- GET /user/metadata - Get all user metadata
- PUT /user/metadata/:key - Update specific metadata
- DELETE /user/metadata/:key - Delete metadata
🏢 Workspaces & Roles
- GET /workspaces - List user workspaces
- POST /workspaces - Create new workspace
- POST /workspaces/:id/invite - Invite user to workspace
- POST /workspaces/accept-invite/:token - Accept workspace invitation
- GET /workspaces/:id/members - Get workspace members
- GET /roles - List available roles
🎙️ Voices
- GET /voices - List available voices
- POST /voices - Create/clone new voice
- GET /voices/:id - Get voice details
- PUT /voices/:id - Update voice
- DELETE /voices/:id - Delete voice
- GET /voices/:id/preview - Get voice preview
🗣️ Text-to-Speech (TTS)
- POST /tts - Generate speech from text
- GET /tts/logs - Get TTS history
- GET /tts/:id - Get TTS details
- DELETE /tts/:id - Delete TTS record
🎧 Speech-to-Text (STT)
- POST /stt - Transcribe audio file
- GET /stt/logs - Get STT history
- GET /stt/:id - Get STT details
- DELETE /stt/:id - Delete STT record
- POST /stt/batch - Batch transcription
- GET /stt/batch/:batchId - Check batch status
🧑‍💻 Developer API
- GET /developer/api - List API keys
- POST /developer/api - Create new API key
- PUT /developer/api/:id - Update API key
- DELETE /developer/api/:id - Delete API key
- POST /developer/api/:id/regenerate - Regenerate API key
- GET /developer/api/:id/usage - Get API key usage
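Creating and regenerating keys usually comes down to producing a random secret and storing only a hash of it, so the plain key is shown to the user once. The README does not describe how this project stores keys, so the sketch below is an assumption throughout: the vk_ prefix, the key length, the SHA-256 hash, and the function names are all hypothetical.

```javascript
const crypto = require('crypto');

// Hash a key for storage; only this digest would be persisted.
function hashApiKey(key) {
  return crypto.createHash('sha256').update(key).digest('hex');
}

// Hypothetical key format: a fixed prefix plus 32 random bytes in hex.
// Returns the plain key (shown to the user once) and the hash to store.
function generateApiKey(prefix = 'vk_') {
  const key = prefix + crypto.randomBytes(32).toString('hex');
  return { key, hash: hashApiKey(key) };
}

// Compare a presented key against a stored hash in constant time.
function matchesStoredHash(presentedKey, storedHash) {
  const presented = Buffer.from(hashApiKey(presentedKey), 'hex');
  const stored = Buffer.from(storedHash, 'hex');
  return presented.length === stored.length && crypto.timingSafeEqual(presented, stored);
}
```

Under this scheme, regenerating a key would mean calling generateApiKey again and overwriting the stored hash, which invalidates the old key immediately.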
📊 Analytics & Notifications
- GET /analytics/usage - Get usage analytics
- GET /analytics/voices - Get voice usage analytics
- GET /analytics/dashboard - Get dashboard data
- GET /notifications - List notifications
- PUT /notifications/:id/read - Mark notification as read
- PUT /notifications/read-all - Mark all notifications as read
💳 Billing & Plans
- GET /plans - List available plans (public)
- GET /billing/current-plan - Get current subscription
- POST /billing/subscribe - Subscribe to plan
- GET /billing/invoices - List invoices
- GET /billing/invoices/:id - Get invoice details
- POST /billing/cancel-subscription - Cancel subscription
- GET /billing/usage-overage - Check usage overage
🔑 API Key Endpoints (External Access)
- POST /api/v1/tts - TTS via API key
- POST /api/v1/stt - STT via API key
- GET /api/v1/voices - List voices via API key
Example Requests
Sign Up
curl -X POST http://localhost:3000/auth/signup \
-H "Content-Type: application/json" \
-d '{
"email": "user@example.com",
"password": "password123",
"name": "John Doe",
"workspace_name": "John'\''s Studio"
}'
Generate Speech
curl -X POST http://localhost:3000/tts \
-H "Authorization: Bearer <jwt_token>" \
-H "Content-Type: application/json" \
-d '{
"voice_id": "voice-uuid",
"text": "Hello, this is a test message!",
"format": "mp3"
}'
Transcribe Audio
curl -X POST http://localhost:3000/stt \
-H "Authorization: Bearer <jwt_token>" \
-H "Content-Type: application/json" \
-d '{
"file_url": "https://example.com/audio.wav",
"language": "en"
}'
🔒 Security Features
- JWT Authentication - Secure token-based authentication
- Rate Limiting - Prevent API abuse with configurable limits
- CORS Protection - Cross-origin request security
- Helmet Security - Additional security headers
- Input Validation - Joi-based request validation
- SQL Injection Protection - Parameterized queries
- Role-based Access Control - Workspace-level permissions
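As an illustration of the rate limiting item, the sketch below implements a simple fixed-window counter driven by the two settings from the .env sample (RATE_LIMIT_WINDOW_MS and RATE_LIMIT_MAX_REQUESTS). This is not necessarily how the project enforces its limits, which may rely on an existing middleware package; createRateLimiter and its options are hypothetical names.

```javascript
// Fixed-window rate limiter kept in memory.
// "now" is injectable so the behaviour can be tested without waiting.
function createRateLimiter({ windowMs = 900000, maxRequests = 100, now = Date.now } = {}) {
  const windows = new Map(); // client key -> { start, count }

  // Returns true if the request identified by "key" is within the limit.
  return function isAllowed(key) {
    const current = now();
    const entry = windows.get(key);

    // Start a fresh window on first sight or once the old window has elapsed.
    if (!entry || current - entry.start >= windowMs) {
      windows.set(key, { start: current, count: 1 });
      return true;
    }

    entry.count += 1;
    return entry.count <= maxRequests;
  };
}
```

The key would typically be the client IP or the API key ID. Because the map lives in process memory and is never pruned, this version suits a single instance only; a deployment behind a load balancer would need a shared store.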
🏗️ Architecture
- Node.js/Express - Web framework
- PostgreSQL - Primary database with JSONB support
- JWT - Authentication tokens
- Joi - Request validation
- bcrypt - Password hashing
- uuid - Unique identifiers
📈 Monitoring & Health
- GET /health - Basic health check
- GET /status - Detailed service status
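For readers wiring up their own probes, a basic health endpoint can be sketched with Node's built-in http module. The response fields here (status, uptime, timestamp) are assumptions; the README does not specify what /health or /status actually return, and the project itself serves these routes through Express.

```javascript
const http = require('http');

// Hypothetical payload; the real endpoint's fields are not documented here.
function buildHealthPayload() {
  return {
    status: 'ok',
    uptime: process.uptime(),
    timestamp: new Date().toISOString(),
  };
}

// Plain request handler: answers GET /health, returns 404 for anything else.
function handleRequest(req, res) {
  if (req.method === 'GET' && req.url === '/health') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(buildHealthPayload()));
    return;
  }
  res.writeHead(404, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ error: 'Not found' }));
}

// Call this to serve the handler; it is defined but not invoked here.
function startServer(port = 3000) {
  return http.createServer(handleRequest).listen(port);
}
```

A detailed /status handler would usually add dependency checks, such as a trivial database query, on top of this.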
🚀 Production Deployment
- Set NODE_ENV=production
- Use proper SSL certificates
- Configure production database
- Set up process management and monitoring (PM2, Docker, etc.)
- Configure load balancing
- Set up backup systems
- Configure log aggregation
📝 TODOs & Integrations
- [ ] Integrate with ElevenLabs API for TTS
- [ ] Integrate with OpenAI Whisper for STT
- [ ] Add Stripe payment processing
- [ ] Implement email notifications
- [ ] Add file upload handling
- [ ] Add audio processing pipeline
- [ ] Implement WebSocket for real-time updates
- [ ] Add comprehensive testing suite
- [ ] Add API documentation with Swagger
- [ ] Implement caching layer (Redis)
