Audio & Voice
Speech & Recognition
- speech recognition - ASR models, transcription, pronunciation assessment
Text-to-Speech
- tts models - TTS model comparison, latency benchmarks, multilingual support
- voice cloning - Voice cloning, voice mixing, naturalness benchmarks
- voice conversion - Voice conversion techniques and pipelines
- audio generation - Audio generation models and workflows
Voice Applications
- voice agent pipelines - Voice agent pipelines and frameworks for real-time applications
- podcast processing - Podcast processing, transcription, and analysis