A text-to-speech, speech-to-text and speech-to-speech library
Crowdsourcing platform for full text transcription and tagging
Text transcription & slicing tool with visual timeline and WAV output
Chat & pretrained large audio language model proposed by Alibaba Cloud
A free, open source, and extensible speech-to-text application
Speech recognition module for Python
Repo of Qwen2-Audio chat & pretrained large audio language model
Audio foundation model excelling in audio understanding
Transcribe any audio to text, translate and edit subtitles 100% locally
Open-source framework for intelligent speech interaction
Voice Recognition to Text Tool
LLM-based Reinforcement Learning audio edit model
Oobabooga - The definitive Web UI for local AI, with powerful features
Transcribe and translate audio offline on your personal computer
A gallery that showcases on-device ML/GenAI use cases
Translate a video from one language to another and embed dubbing
Multi-modal large language model designed for audio understanding
AI-powered tool for generating, optimizing, and translating subtitles
Instantly generate AI-powered subtitles on your device
An opinionated CLI to transcribe audio files with Whisper on-device
Audiocraft is a library for audio processing and generation
Generate audiobooks from EPUBs, PDFs and text with captions
Code for openai.fm, a demo for the OpenAI Speech API
LilyPond sheet music text editor