A single Gradio + React WebUI with extensions for ACE-Step
Synchronized Translation for Videos
A fast TTS architecture with conditional flow matching
The Python library for real-time communication
Diffusion Transformer with Fine-Grained Chinese Understanding
One-click deployment (including offline integration package)
Speech-AI-Forge is a project developed around TTS generation models
Real-time voice interactive digital human
Unified Multimodal Understanding and Generation Models
A Web UI for easy subtitle generation using the Whisper model
From Images to High-Fidelity 3D Assets
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Multimodal Diffusion with Representation Alignment
RGBD video generation model conditioned on camera input
Open-source template for AI-powered code generation apps with sandboxes
A Colab Gradio web UI for running large language models
A web UI for different audio-related neural networks