Showing 58 open source projects for "vocal"

  • 1
    Ultimate Vocal Remover (UVR5)

    GUI for a Vocal Remover that uses Deep Neural Networks

    This application uses state-of-the-art source separation models to remove vocals from audio files. UVR's core developers trained all of the models provided in this package (except for the Demucs v3 and v4 4-stem models).
    Downloads: 758 This Week
  • 2
    vocal-separate

    An extremely simple tool for separating vocals and background music

    vocal-separate is a simple but effective audio processing application that isolates vocals and instrumental tracks from music and video files using stem-based source separation models, enabling tasks such as karaoke creation, remixing, and music analysis. Built as a localized web-based tool, it runs entirely on the user’s machine without requiring an internet connection, emphasizing privacy and convenience for creative work.
    Downloads: 7 This Week
  • 3
    YuE

    Open source AI model for generating full songs from lyrics prompts

    YuE is an open source project that provides a foundation model designed for full-song music generation using artificial intelligence. It focuses on transforming text inputs such as lyrics and genre prompts into complete musical compositions that include both vocal and instrumental tracks. Unlike many shorter audio generators, the model is capable of producing songs that last several minutes while maintaining coherent musical structure and alignment with the provided lyrics. YuE introduces a family of models built on large language model architectures that process music generation as a sequence prediction task. ...
    Downloads: 8 This Week
  • 4
    StemRoller

    Isolate vocals, drums, bass, and other instrumental stems from songs

    StemRoller is the first free app that enables you to separate vocal and instrumental stems from any song with a single click! StemRoller uses Facebook's state-of-the-art Demucs algorithm for demixing songs and integrates search results from YouTube. Simply type the name/artist of any song into the search bar and click the Split button that appears in the results! You'll need to wait several minutes for splitting to complete.
    Downloads: 40 This Week
  • 5
    Transcoder

    Hardware-accelerated video transcoding using Android MediaCodec APIs

    ...Unlike traditional speech translation systems that rely on multi-stage pipelines, Transcoder directly translates one speaker’s video into another language while preserving facial expressions, lip-sync, and vocal identity. Designed for real-time use and production-grade pipelines, Transcoder combines advanced deep learning models with GPU acceleration to deliver high-quality translations across languages. It’s built with researchers and developers in mind, offering tools for testing, evaluating, and deploying AI-driven media localization.
    Downloads: 5 This Week
  • 6
    GPT-SoVITS

    1 min voice data can also be used to train a good TTS model

    GPT‑SoVITS is a state-of-the-art voice conversion and TTS system that enables zero‑shot and few‑shot synthesis based on a short vocal sample (e.g., 5 seconds). It supports cross‑lingual speech synthesis across English, Chinese, Japanese, Korean, Cantonese, and more. It's powered by VITS architecture enhanced for few‑sample adaptation and real‑time usability.
    Downloads: 55 This Week
  • 7
    Voice-Pro

    Comprehensive Gradio WebUI for audio processing

    Voice-Pro is the best Gradio WebUI for transcription, translation and text-to-speech. It can be installed with one click. It creates a virtual environment using Miniconda that runs completely separately from the Windows system (fully portable). It supports real-time transcription and translation, as well as batch mode.
    Downloads: 38 This Week
  • 8
    ACE-Step 1.5

    The most powerful local music generation model

    ACE-Step 1.5 is an advanced open-source foundation model for AI-driven music generation that pushes beyond traditional limitations in speed, musical coherence, and controllability by innovating in architecture and training design. It integrates cutting-edge generative techniques—such as diffusion-based synthesis combined with compressed autoencoders and lightweight transformer elements—to produce high-quality full-length music tracks with rapid inference times, capable of generating a...
    Downloads: 105 This Week
  • 9
    Step-Audio

    Open-source framework for intelligent speech interaction

    Step-Audio is a unified, open-source framework aimed at building intelligent speech systems that combine both comprehension and generation: it integrates large language models (LLMs) with speech input/output to handle not only semantic understanding but also rich vocal characteristics like tone, style, dialect, emotion, and prosody. The design moves beyond traditional separate-component pipelines (ASR → text model → TTS), instead offering a multimodal model that ingests speech or audio and produces speech accordingly, enabling natural dialogue, voice cloning, and expressive speech synthesis. Through its architecture, Step-Audio supports multilingual interaction, dialects, emotional tones (joy, sadness, etc.), and even more creative speech styles (like rap or singing), while allowing dynamic control over speech characteristics. ...
    Downloads: 1 This Week
  • 10
    Qwen2-Audio

    Repo of Qwen2-Audio chat & pretrained large audio language model

    ...Code & examples provided with Hugging Face transformers, and usage via AutoProcessor, model classes etc. High performance on many standard benchmarks: ASR, speech-emotion recognition, vocal sound classification, speech translation etc.
    Downloads: 0 This Week
  • 11
    Step-Audio 2

    Multi-modal large language model designed for audio understanding

    ...It integrates a latent-space audio encoder, discrete acoustic tokens, and reinforcement-learning–based training (CoT + RL) to enhance its ability to capture and reproduce voice styles, intonations, and subtle vocal cues. Moreover, Step-Audio 2 supports tool-calling and retrieval-augmented generation (RAG), allowing it to access external knowledge sources or audio/text databases, thus reducing hallucinations and improving coherence in complex dialogues.
    Downloads: 0 This Week
  • 12
    Fish Speech

    SOTA Open Source TTS

    ...Fish Speech emphasizes expressive and controllable voices: it supports a long list of emotion tags, tone markers, and special audio effect markers that can be embedded in the text to drive prosody and vocal style, from basic emotions to nuanced states like sarcastic, conciliative, or hysterical. The system is multilingual and cross-lingual, handling multiple languages in a single input without explicit phoneme markup, and is trained on large-scale datasets.
    Downloads: 23 This Week
  • 13
    Step-Audio-EditX

    LLM-based Reinforcement Learning audio edit model

    Step-Audio-EditX is an open-source, 3 billion-parameter audio model from StepFun AI designed to make expressive and precise editing of speech and audio as easy as text editing. Rather than treating audio editing as low-level waveform manipulation, this model converts speech into a sequence of discrete “audio tokens” (via a dual-codebook tokenizer) — combining a linguistic token stream and a semantic (prosody/emotion/style) token stream — thereby abstracting audio editing into high-level...
    Downloads: 3 This Week
  • 14
    Free Karaoke File Maker

    Free Karaoke File Maker

    You can hide the singer's voice in music files whose vocals cannot otherwise be hidden on the computer. By default, the output is saved with 2 audio tracks (singer + melody). To save only the melody without the singer's voice, select the No Vocal option. To choose where the output file is saved, click Save Folder and pick a location (default: Desktop). Once these settings are in place, convert a file by dragging it onto the Drag & Drop Input File area or by clicking Select File. No internet connection is needed.
    Downloads: 2 This Week
  • 15
    byzorgan

    Specialized sound synthesizer with Byzantine Church music scales

    This software integrates a small, specialized synthesizer and vocal processor. It can be used to learn Byzantine Church singing. You can play from the keyboard, mouse or touch screen. MIDI input is also available. Voice functions include: pitch highlighting, synthesizer control by voice, pitch correction and voice-to-ison conversion. On the screen there are labels with symbols of Byzantine notes.
    Downloads: 23 This Week
  • 16
    RunningCoachFr

    French-speaking personal coach for interval runs set to music!

    ⭐ Discover your personal coach for interval running set to music! 🏃🎵 ✨Easily create a tailored training session with your own MP3 playlist, guided by a French-speaking voice coach. This dynamic program lets you design personalized sessions to optimize your interval-running performance while staying motivated by suitable music. ✨ ⚠️ Important - Windows SmartScreen: When you launch this software, Windows may display the message "Windows protected your PC". ...
    Downloads: 1 This Week
  • 17
    Demucs

    Code for the paper Hybrid Spectrogram and Waveform Source Separation

    Demucs (Deep Extractor for Music Sources) is a deep-learning framework for music source separation—extracting individual instrument or vocal tracks from a mixed audio file. The system is based on a U-Net-like convolutional architecture combined with recurrent and transformer elements to capture both short-term and long-term temporal structure. It processes raw waveforms directly rather than spectrograms, allowing for higher-quality reconstruction and fewer artifacts in separated tracks. ...
    Downloads: 124 This Week
  • 18
    Kalliope

    Kalliope is a framework to create your own personal assistant

    ...Kalliope is a framework that will help you to create your own personal assistant. The concept is to create the brain of your assistant by attaching an input signal (vocal order, scheduled event, MQTT message, GPIO event, etc.) to one or multiple actions called neurons. You can create your own Kalliope bot by simply choosing and composing the existing neurons without writing any code. But if you need a particular module, you can write it yourself, add it to your project, and propose it to the community. ...
    Downloads: 1 This Week
  • 19
    PseudonymizeSpeech

    Praat script to pseudonymize speech.

    ...There is a trade-off between the level of pseudonymization and the (para-)linguistic features retained. The approach is to manipulate the spectro-temporal structure of the speech to simulate a different length and structure of the vocal tract, as well as a different pitch and speaking rate. The method is deterministic and partially reversible. The extent of the changes is adjustable and gradual.
    Downloads: 0 This Week
  • 20
    KuStudio

    Minimalistic and superstable OSC timeline/sequencer

    KuStudio is an open source OSC timeline sequencer, recorder and player for creating a timeline on an audio track. It can be used as the core timeline module in interactive audiovisual and dance/vocal performances. For installation instructions, see KuStudio-Guide.pdf, included in the KuStudio archives. For quick support, write to perevalovds@gmail.com. KuStudio lets you create and record OSC tracks synchronized with a given audio track. The audio track can be a WAV or AIFF file. KuStudio is inspired by the well-known Duration OSC editor but has a different philosophy: it stores all OSC tracks as discrete arrays, not curves, which allows them to be recorded and edited freely. ...
    Downloads: 0 This Week
  • 21

    Notasi Angka

    Number Music Notation

    In Indonesia and other countries, number ("angka") music notation is used mainly for vocal music. This project will create a number-notation editor, similar to MuseScore, to help with entering and playing number notation. Programmers, musicians, and graphic designers are welcome to join this project.
    Downloads: 0 This Week
  • 22
    Zone:X

    Use your voice's pitch to control the pitch of a space ship.

    Take song to the next level by piloting this equalizer spacecraft with a spiralized waveform visualizer aura, using only your wits and tones. Easy to play, no experience necessary! To find out more, visit zonex.rf.gd
    Downloads: 0 This Week
  • 23
    Winboard 4.5 Accessible Chess

    Chess for the Blind for the JAWS or NVDA Screen Readers

    Winboard 4.5 32-bit is a free, accessible Windows chess program that works automatically with the JAWS or the free NVDA screen reader. It is for blind or low-vision players and for those who cannot use a mouse. It provides vocal announcements of position changes and other selectable board conditions. Blind players also use a separate "tactile chess board". Winboard 4.5 has full keyboard access to move pieces and run menu items. Partially sighted players can use high contrast mode and adjust board, piece, and most font sizes and colors. Available languages are English, German, Spanish, Italian, Dutch and Russian. ...
    Downloads: 6 This Week
  • 24

    phpSheetMusic

    Manage your sheet music online

    phpSheetMusic is a simple content management system for sheet music with initial focus on vocal groups. It allows management of works, individual copies and the lifecycle of rental and return of copies to members.
    Downloads: 0 This Week
  • 25

    Voice Choir Modulator with PureData

    Technique of Vocal Tract model, Subtractive synthesis + Effects

    This is a voice modulator implemented in PureData. Different techniques are applied in order to find the best result for choir (vocal tract, subtractive synthesis, AM, FM, etc.). Some effects are also included, such as Vibrato, Tremolo and Reverb. The folder also has a description of the project with block diagrams and a user manual. The modulator can also be used with a MIDI controller.
    Downloads: 1 This Week