@yuchida-tamu/podcast-gen
v1.0.7
PodcastGen 🎙️
AI-Powered Monologue Podcast Generator
PodcastGen is a command-line application that generates natural-sounding podcast episodes in which a single AI narrator delivers a monologue. You supply a topic, and the tool produces a concatenated MP3 file and a structured JSON script that explore the topic in a balanced way, following an introduction-exploration-conclusion structure.
🌟 Features
- Natural Monologue Generation: Single AI narrator delivers engaging, balanced explorations
- Narrative Structure: Follows introduction-exploration-conclusion flow
- JSON Output Format: Generates structured JSON scripts with timestamps and metadata
- Audio Concatenation: Automatically combines audio segments into a single MP3 file
- Flexible Duration: Supports 5- or 10-minute podcast episodes
- CLI Interface: Simple command-line interface with progress indicators
- Error Handling: Robust validation and helpful error messages
🚀 Quick Start
Prerequisites
- Node.js 18+ installed on your system
- npm package manager
- OpenAI API key (get one at platform.openai.com)
API Key Setup
You can provide API keys in two ways:
Via CLI flag (recommended for quick usage):
npx podcast-gen "topic" --openai-key sk-your-key-here
Via environment variables (recommended for regular usage):
export OPENAI_API_KEY=sk-your-key-here
Or create a .env file:
OPENAI_API_KEY=sk-your-key-here
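As a rough illustration of how the two sources might be combined, the sketch below resolves the key from the flag first and the environment second. The helper name `resolveApiKey` and the precedence order (flag over environment variable) are assumptions for illustration, not taken from the PodcastGen source.

```typescript
// Hypothetical helper; not part of the published package.
type Env = Record<string, string | undefined>;

function resolveApiKey(flagValue: string | undefined, env: Env): string {
  // Assumption: a key passed with --openai-key wins over OPENAI_API_KEY.
  const fromFlag = (flagValue ?? "").trim();
  const fromEnv = (env["OPENAI_API_KEY"] ?? "").trim();
  const key = fromFlag !== "" ? fromFlag : fromEnv;
  if (key === "") {
    throw new Error(
      "Missing OpenAI API key: pass --openai-key or set OPENAI_API_KEY."
    );
  }
  return key;
}
```

In a Node.js program this would typically be called with the parsed flag value and `process.env`.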
Installation
For End Users (Recommended)
No installation required! Just use npx:
npx podcast-gen "Your topic here" --openai-key sk-your-key-here
For Development
Clone the repository:
git clone <repository-url>
cd podcast-gen
Install dependencies:
npm install
Set up environment variables:
cp .env.example .env
# Edit .env and add your API keys
Build the project:
npm run build
Basic Usage
Using npx (Recommended)
Option 1: Provide API key via flag
npx podcast-gen "Is universal basic income feasible?" --openai-key sk-your-key-here
Option 2: Use environment variables
export OPENAI_API_KEY=sk-your-key-here
npx podcast-gen "Is universal basic income feasible?"
Generate a 10-minute podcast with custom output directory:
npx podcast-gen "Climate change solutions" --duration 10 --output ./my-podcasts --openai-key sk-your-key-here
Use existing script:
npx podcast-gen "" --script ./path/to/script.json --openai-key sk-your-key-here
Local Development
For local development, you can also use:
npm run dev -- "Topic" --openai-key sk-your-key-here
Or set environment variables:
export OPENAI_API_KEY=sk-your-key-here
npm run dev "Topic"
Command Options
npx podcast-gen <topic> [options]
Arguments:
  topic                      Topic for the monologue podcast
Options:
  -d, --duration <minutes>   Duration in minutes (5 or 10) (default: "5")
  -o, --output <path>        Output directory (default: "./output")
  -s, --script <path>        Use existing script file instead of generating new content
  --openai-key <key>         OpenAI API key (alternatively set OPENAI_API_KEY env var)
  -h, --help                 Display help for command
📁 Output Files
The generator creates multiple files for each podcast:
- Script: topic-slug_YYYY-MM-DD.json - Structured JSON script with timestamps and metadata
- Audio: topic-slug_YYYY-MM-DD.mp3 - Final concatenated MP3 audio file
- Segments: topic-slug_YYYY-MM-DD_segment_001.mp3 - Individual audio segments
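The naming convention above can be sketched roughly as follows. The helpers `slugify` and `outputFileNames` are hypothetical names, and the slug rule (lowercase, non-alphanumeric runs collapsed to hyphens) and the use of the UTC date are assumptions inferred from the example file names, not the package's actual code.

```typescript
// Hypothetical helpers illustrating the naming convention; not the package's own code.
function slugify(topic: string): string {
  return topic
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse each run of non-alphanumerics into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
}

function outputFileNames(topic: string, date: Date, segmentCount: number) {
  // Assumption: the date part is the UTC calendar day in YYYY-MM-DD form.
  const day = date.toISOString().slice(0, 10);
  const base = `${slugify(topic)}_${day}`;
  const segments = Array.from(
    { length: segmentCount },
    (_, i) => `${base}_segment_${String(i + 1).padStart(3, "0")}.mp3`
  );
  return { script: `${base}.json`, audio: `${base}.mp3`, segments };
}
```

Calling `outputFileNames("Is universal basic income feasible?", new Date("2025-07-07T14:30:00Z"), 2)` yields `is-universal-basic-income-feasible_2025-07-07.json`, the matching `.mp3`, and two `_segment_00N.mp3` names.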
Example Output Structure
output/
├── is-universal-basic-income-feasible_2025-07-07.json
├── is-universal-basic-income-feasible_2025-07-07.mp3
├── is-universal-basic-income-feasible_2025-07-07_segment_001.mp3
├── is-universal-basic-income-feasible_2025-07-07_segment_002.mp3
├── climate-change-solutions_2025-07-07.json
├── climate-change-solutions_2025-07-07.mp3
├── climate-change-solutions_2025-07-07_segment_001.mp3
└── climate-change-solutions_2025-07-07_segment_002.mp3
📋 Script Format
Generated JSON scripts follow this structure:
{
  "title": "Is Universal Basic Income Feasible?",
  "generated": "2025-07-07T14:30:00Z",
  "duration": 300,
  "segments": [
    {
      "text": "You know, I've been thinking about this topic...",
      "timestamp": "00:00",
      "duration": 15,
      "emotion": "thoughtful"
    }
  ],
  "metadata": {
    "topic": "Is Universal Basic Income Feasible?",
    "totalSegments": 8,
    "estimatedDuration": 300,
    "format": "monologue",
    "version": "1.0.0"
  }
}
🛠️ Development
Project Structure
src/
├── cli.js # Main CLI entry point
├── orchestrator.ts # Main podcast generation workflow
├── monologue/ # Monologue generation logic
│ └── engine.ts # Core monologue generation
├── llm/ # Language model services
│ ├── OpenAIService.ts # OpenAI integration
│ └── APIClient.ts # Base API client with retry logic
├── script/ # Script formatting
│ └── formatter.ts # JSON formatting with timestamps
├── audio/ # Audio synthesis and processing
│ ├── synthesizer.ts # OpenAI TTS integration
│ └── dataTransformer.ts # MP3 concatenation and processing
├── types/ # TypeScript type definitions
│ └── index.ts # Shared types
└── utils/ # Shared utilities
    ├── progress.ts # Progress indicators
    └── errors.ts # Error handling
Environment Setup
Copy the environment template:
cp .env.example .env
Add your API keys:
OPENAI_API_KEY=your_openai_key_here
Available Scripts
npm run dev # Run in development mode
npm run build # Build the project
npm run start # Run the built application
npm test # Run all tests
npm run test:watch # Run tests in watch mode
npm run lint # Run ESLint
npm run lint:fix # Fix ESLint issues
🧪 Testing Locally
Manual Testing
Basic functionality test (with API key flag):
npx podcast-gen "Should AI replace human creativity?" --openai-key sk-your-key
Using environment variables:
export OPENAI_API_KEY=sk-your-key
npx podcast-gen "Should AI replace human creativity?"
Duration options test:
npx podcast-gen "The future of work" --duration 10 --openai-key sk-your-key
Custom output directory test:
npx podcast-gen "Space exploration ethics" --output ./test-output --openai-key sk-your-key
Script file test:
npx podcast-gen "" --script ./output/existing-script.json --openai-key sk-your-key
Error handling tests:
# Too short topic
npx podcast-gen "AI" --openai-key sk-your-key
# Invalid duration
npx podcast-gen "Philosophy of mind" --duration 7 --openai-key sk-your-key
# Missing API key
npx podcast-gen "Some topic"
Expected Behavior
- ✅ Successful generation should show progress steps (1/5 → 5/5)
- ✅ Files should be created in the specified output directory
- ✅ Script files should contain properly formatted JSON with timestamps
- ✅ Audio output should include one concatenated MP3 file plus the individual segment files
- ✅ Error messages should be helpful and descriptive
- ✅ Audio concatenation should eliminate noise between segments
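One way to check the script-related expectation above is to type the JSON and verify that segment timestamps line up with the durations. The sketch below assumes that each `timestamp` is an `MM:SS` offset equal to the sum of all earlier segment durations; the source example only shows a first segment at `00:00`, so that rule and the names `toTimestamp` and `checkTimestamps` are illustrative assumptions.

```typescript
// Types mirror the example in "Script Format"; treating every field as required is an assumption.
interface Segment {
  text: string;
  timestamp: string; // assumed "MM:SS" offset from the start of the episode
  duration: number; // seconds
  emotion: string;
}

interface PodcastScript {
  title: string;
  generated: string;
  duration: number;
  segments: Segment[];
  metadata: {
    topic: string;
    totalSegments: number;
    estimatedDuration: number;
    format: string;
    version: string;
  };
}

// Format a number of seconds as zero-padded MM:SS.
function toTimestamp(totalSeconds: number): string {
  const mm = String(Math.floor(totalSeconds / 60)).padStart(2, "0");
  const ss = String(Math.floor(totalSeconds % 60)).padStart(2, "0");
  return `${mm}:${ss}`;
}

// Returns a list of problems; an empty list means the timestamps are consistent.
function checkTimestamps(script: Pick<PodcastScript, "segments">): string[] {
  const problems: string[] = [];
  let elapsed = 0;
  script.segments.forEach((segment, index) => {
    const expected = toTimestamp(elapsed);
    if (segment.timestamp !== expected) {
      problems.push(
        `segment ${index}: expected ${expected}, found ${segment.timestamp}`
      );
    }
    elapsed += segment.duration;
  });
  return problems;
}
```

A script loaded with `JSON.parse` can be passed to `checkTimestamps` directly; any returned strings point at the segments whose offsets do not match.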
Validation Checklist
- [ ] Topic validation (5-200 characters)
- [ ] Duration validation (5 or 10 minutes only)
- [ ] Output directory creation
- [ ] File naming convention
- [ ] Progress indicators display
- [ ] Error handling for various edge cases
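The first two checklist items can be expressed as a small validator. This is a minimal sketch: the function name `validateInput`, the error wording, and the choices to trim the topic and to accept the duration as a string (suggested by the `"5"` default in the command options) are assumptions, and the `--script` case with an empty topic is not covered.

```typescript
// Hypothetical validator; the limits come from the checklist above.
const ALLOWED_DURATIONS = [5, 10];

function validateInput(topic: string, duration: string): string[] {
  const errors: string[] = [];

  // Topic must be 5-200 characters (assumption: measured after trimming whitespace).
  const length = topic.trim().length;
  if (length < 5 || length > 200) {
    errors.push(`Topic must be 5-200 characters (got ${length}).`);
  }

  // Duration must be exactly 5 or 10 minutes.
  const minutes = Number(duration);
  if (!ALLOWED_DURATIONS.includes(minutes)) {
    errors.push(`Duration must be 5 or 10 minutes (got "${duration}").`);
  }

  return errors;
}
```

With this sketch, the topic `"AI"` and the duration `"7"` from the error handling tests each produce one error, while `"The future of work"` with `"10"` produces none.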
🤝 Contributing
We welcome contributions! Please follow these guidelines:
Getting Started
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes
- Test thoroughly using the testing instructions above
- Commit your changes: git commit -m 'Add amazing feature'
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request
Development Guidelines
Code Style
- Use ES6+ JavaScript features
- Follow existing code patterns and structure
- Add comments for complex logic
- Use descriptive variable and function names
Testing
- Test all new features manually using the local testing instructions
- Ensure error handling works correctly
- Verify output file formats are correct
- Test edge cases and invalid inputs
Commit Messages
- Use clear, descriptive commit messages
- Start with a verb (Add, Fix, Update, etc.)
- Keep the first line under 50 characters
- Add detailed description if needed
Types of Contributions
- 🐛 Bug fixes: Fix issues with existing functionality
- ✨ New features: Add new capabilities to the generator
- 📚 Documentation: Improve README, code comments, or guides
- 🎨 Code quality: Refactoring, optimization, or style improvements
- 🧪 Testing: Add or improve test coverage
Future Enhancement Areas
- Additional voice options and personalities
- Background music and sound effects
- Web interface
- Multi-language support
- Custom voice cloning
- Batch processing capabilities
- Real-time streaming generation
- Enhanced audio post-processing
Code Review Process
- All PRs require at least one review
- Ensure all tests pass
- Check for code style consistency
- Verify documentation updates if needed
- Test the changes locally
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- Built with Node.js and Commander.js
- Inspired by thoughtful, reflective monologue formats
- Designed for educational and creative content generation
📞 Support
If you encounter any issues or have questions:
- Check the troubleshooting section
- Review existing issues in the repository
- Create a new issue with detailed information about the problem
- Include your Node.js version and operating system
Note: This implementation uses real OpenAI API integrations for both text generation and audio synthesis.
