Search Results for "delphi audio components"

Showing 31 open source projects for "delphi audio components"

Active filter: Python

1

Kimi-Audio

Audio foundation model excelling in audio understanding

Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one system, enabling developers to build rich, multimodal audio applications without stitching together disparate components.

Downloads: 1 This Week

Last Update: 2026-01-27
2

Hugging Face - Speech To Speech

Open speech-to-speech models and pipelines by Hugging Face

This project from Hugging Face focuses on enabling direct speech-to-speech processing using modern machine learning models. It provides tools and reference implementations that allow audio input to be transformed into audio output without requiring an intermediate text representation. Hugging Face - Speech To Speech builds on recent advances in speech modeling, combining components such as speech recognition, translation, and synthesis into unified pipelines. It is designed to help researchers and developers experiment with multilingual and cross-lingual voice applications. ...

Downloads: 3 This Week

Last Update: 2026-03-18
3

Pipecat

Framework for building real-time voice and multimodal AI agents

...Developers can create a wide range of interactive systems including voice assistants, customer service agents, interactive storytelling applications, and multimodal interfaces that combine voice, video, images, and text. Its modular architecture allows components to be composed into pipelines that process audio, text, and video streams in real time.

Downloads: 9 This Week

Last Update: 2 days ago
4

Real-Time Voice Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Real-Time Voice Cloning is an influential deep-learning repository that demonstrates how to clone a voice from just a few seconds of audio and then generate arbitrary speech in that voice in near real time. It implements the SV2TTS pipeline (“Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis”) in three stages: a speaker encoder, a synthesizer, and a vocoder. In the first stage, short audio clips are converted into a fixed-dimensional speaker embedding that captures voice characteristics; this embedding is then used by a Tacotron-style synthesizer to generate spectrograms from text, which a WaveRNN-based vocoder finally turns into audio. ...

Downloads: 8 This Week

Last Update: 2026-03-09
5

WhisperX

Automatic Speech Recognition with Word-level Timestamps

...Its architecture combines multiple components to enhance both performance and usability in real-world transcription tasks. Overall, WhisperX provides a more robust and scalable solution for high-quality speech-to-text applications.

Downloads: 8 This Week

Last Update: 2026-04-06
6

YuE

Open source AI model for generating full songs from lyrics prompts

...It includes inference scripts, prompt examples, evaluation tools, and training components that enable researchers and developers to experiment with AI-based music.

Downloads: 8 This Week

Last Update: 2 days ago
7

VibeVoice ComfyUI

ComfyUI integration for Microsoft's VibeVoice text-to-speech model

VibeVoice ComfyUI is a comprehensive wrapper that integrates Microsoft’s VibeVoice text-to-speech models directly into ComfyUI workflows. It exposes VibeVoice as a set of custom nodes so you can build single-speaker and multi-speaker voice generation pipelines visually, combining TTS with other audio or generative components. The integration supports high-quality single-speaker synthesis as well as scripted multi-speaker conversations, with optional voice cloning from audio samples for each speaker. It includes advanced control over generation parameters like attention backend, diffusion steps, sampling temperature, guidance scale, and quantization settings, allowing users to tune the trade-offs between quality, VRAM usage, and speed. ...

Downloads: 8 This Week

Last Update: 2025-11-28
8

BasedHardware

Open source AI wearable platform for recording and summarizing speech

...Omi includes firmware for wearable hardware, a Flutter-based mobile companion application, backend services built with Python and FastAPI, and various SDKs for developers. These components work together to process audio, perform speech recognition, and integrate AI features such as summaries and automated actions. Developers can extend the platform by building plugins, integrations, and custom applications using provided SDKs and APIs. The repository also supports experimental hardware implementations.

Downloads: 8 This Week

Last Update: 4 hours ago
9

OmAgent

Build multimodal language agents for fast prototyping and production

OmAgent is an open-source Python framework designed to simplify the development of multimodal language agents that can reason, plan, and interact with different types of data sources. The framework provides abstractions and infrastructure for building AI agents that operate on text, images, video, and audio while maintaining a relatively simple interface for developers. Instead of forcing developers to implement complex orchestration logic manually, the system manages task scheduling, worker...

Downloads: 7 This Week

Last Update: 2026-03-05
10

LTX-Video

Official repository for LTX-Video

LTX-Video is a sophisticated multimedia processing framework from Lightricks designed to handle high-quality video editing, compositing, and transformation tasks with performance and scalability. It provides runtime components that efficiently decode, encode, and manipulate video streams, frame buffers, and audio tracks while exposing a rich API for building customized editing features like transitions, effects, color grading, and keyframe automation. The toolkit is built with both real-time and offline workflows in mind, enabling applications from consumer editing to professional content creation and batch processing. ...

Downloads: 10 This Week

Last Update: 2026-01-11
11

Qwen3-TTS

Qwen3-TTS is an open-source series of TTS models

...Developers can customize voice output parameters like speed, pitch, and volume, and combine the TTS stack with other AI components.

Downloads: 12 This Week

Last Update: 2026-03-17
12

Instill Core

Instill Core is a full-stack AI infrastructure tool for data

...It provides an end-to-end solution that enables developers to build, deploy, and manage AI-powered applications without needing to manually stitch together multiple tools across the data and model lifecycle. The platform focuses heavily on handling unstructured data such as documents, images, audio, and video, transforming them into AI-ready formats through integrated ETL pipelines and processing workflows. Instill Core includes modular components such as pipelines, artifacts, and model services, which work together to enable flexible and scalable AI system design. It also supports retrieval-augmented generation workflows and model deployment without requiring complex GPU infrastructure management.

Downloads: 6 This Week

Last Update: 2026-03-19
13

WhatsApp MCP Server

WhatsApp MCP server enabling AI access to chats and messaging

...It acts as a bridge between WhatsApp and large language models, allowing controlled access to messages, chats, and contacts. whatsapp-mcp is composed of two main components: a Go-based bridge that connects to the WhatsApp Web API and stores data locally, and a Python-based MCP server that exposes tools for AI interaction. All message data is stored in a local SQLite database and is only accessed when explicitly requested through defined tools, giving users control over how their data is used. It supports both sending and receiving messages, including various media types such as images, audio, videos, and documents. ...

Downloads: 3 This Week

Last Update: 2026-03-17
14

CosyVoice

Multi-lingual large voice generation model, providing inference

CosyVoice is a multilingual large voice generation model that offers a full-stack solution for training, inference, and deployment of high-quality TTS systems. The model supports multiple languages, including Chinese, English, Japanese, Korean, and a range of Chinese dialects such as Cantonese, Sichuanese, Shanghainese, Tianjinese, and Wuhanese. It is designed for zero-shot voice cloning and cross-lingual or mix-lingual scenarios, so a single reference voice can be used to synthesize speech...

Downloads: 5 This Week

Last Update: 2025-11-30
15

Omnilingual ASR

Omnilingual ASR: Open-Source Multilingual Speech Recognition

Omnilingual-ASR is a research codebase exploring automatic speech recognition that generalizes across a very large number of languages using shared modeling and training recipes. It focuses on leveraging self-supervised audio pretraining and scalable fine-tuning so low-resource languages can benefit from high-resource data. The project provides data preparation pipelines, training scripts, decoding utilities, and evaluation tools so researchers can reproduce results and extend to new...

Downloads: 1 This Week

Last Update: 2025-12-12
16

Jina-Serve

Build multimodal AI applications with cloud-native stack

Jina Serve is an open-source framework designed for building, deploying, and scaling AI services and machine learning pipelines in production environments. The framework allows developers to create microservices that expose machine learning models through APIs that communicate using protocols such as HTTP, gRPC, and WebSockets. It is built with a cloud-native architecture that supports deployment on local machines, containerized environments, or large orchestration platforms such as...

Downloads: 0 This Week

Last Update: 2026-03-10
17

Multimodal

TorchMultimodal is a PyTorch library

...The library provides modular building blocks such as encoders, fusion modules, loss functions, and transformations that support combining modalities (vision, text, audio, etc.) in unified architectures. It includes a collection of ready model classes—like ALBEF, CLIP, BLIP-2, COCA, FLAVA, MDETR, and Omnivore—that serve as reference implementations you can adopt or adapt. The design emphasizes composability: you can mix and match encoder, fusion, and decoder components rather than starting from monolithic models. ...

Downloads: 0 This Week

Last Update: 2026-01-12
18

Snap7

32/64-bit multi-platform Ethernet S7 PLC communication suite

Through three specialized components (Client, plus the novel Server and Partner), Snap7 lets you fully integrate your PC-based systems into a PLC automation chain. Designed to transfer large amounts of high-speed data in industrial facilities, it scales down easily to small Linux ARM boards such as the Raspberry Pi. High-level object-oriented wrappers are provided, currently for C/C++, .NET/Mono, Pascal, LabVIEW, and Python, with many source code examples. Very easy to use, a full...

26 Reviews

Downloads: 923 This Week

Last Update: 2025-06-24
19

vocal-separate

An extremely simple tool for separating vocals and background music

...Users can drag and drop an audio or video file onto the interface to begin separation, choosing between two, four, or five stems, which allows isolating specific components like vocals, bass, drums, or piano depending on the chosen model. After processing, the tool outputs separate WAV files for each extracted stem, making it easy to export and use in audio editing or remix software.

Downloads: 7 This Week

Last Update: 2026-02-17
20

Ascoos Web Server

A web server for all web developers and web designers

For PHP 5.6 - 8.4.X, see Ascoos Web Extended Studio (AWES): https://sourceforge.net/projects/ascoos-web-extended-studio/ ASCOOS Web Server is a rich package designed as a versatile web server for development purposes. It incorporates third-party components such as PHP, MySQL, pgSQL, MongoDB, and FileZilla, and stands out for its compact setup and well-built administrative panel. ASCOOS Web Server allows you to work with multiple versions of PHP and MySQL without having to...

4 Reviews

Downloads: 1 This Week

Last Update: 2025-04-04
21

bitfarm-Archiv Document Management - DMS

bitfarm-Archiv is a powerful Document Management System (DMS), Enterprise Content Management (ECM), and Knowledge Management System (KMS) with workflow components. Help us! Since we live in the internet age, the best way you can help is to write a short statement about your scenario and your use of the DMS, along with your experiences, and post it on your own website or in a blog or forum. It would help us most if you also add a hyperlink to our site http://www.bitfarm-archiv.com. By this...

11 Reviews

Downloads: 10 This Week

Last Update: 2 days ago
22

Riffusion

Real-time music generation using Stable Diffusion techniques

...Riffusion (hobby) serves as the core implementation for audio and image processing, providing essential building blocks for generating music from text prompts. It includes both developer-oriented tools and user-facing components such as a command-line interface and an interactive Streamlit application for experimentation. Additionally, it can run as a Flask server to expose model inference through an API, enabling integration with other applications or services.

Downloads: 2 This Week

Last Update: 2026-03-18
23

uncaptcha

Defeating Google's audio reCaptcha with 85% accuracy

...It employs signal processing techniques such as segmenting audio clips into individual components before transcription, which improves accuracy in noisy or complex audio conditions. The project was developed as part of academic research to highlight potential weaknesses in CAPTCHA systems and includes disclaimers emphasizing responsible use. While it achieved high success rates at the time of publication, later updates to reCAPTCHA have reduced its effectiveness.

Downloads: 0 This Week

Last Update: 2026-03-17
24

autoradio

Radio automation software. Starting from media files, it manages broadcasting for a radio station. The main components are the Player, the Scheduler, and the web user interface. Developed with Python, Django, and XMMS, it works in a production environment.

1 Review

Downloads: 0 This Week

Last Update: 2017-06-12
25

Demonstock

A GUI foundation for creating UI-based content

Demonstock is a GUI foundation for creating user interfaces and other graphic content. It features a library of tools and functions that are used to build GUI elements and components. Based on both Qt and Visual Basic designs, Demonstock is a flexible GUI editor with plugin and extension capability.

Downloads: 0 This Week

Last Update: 2013-05-29

© 2026 Slashdot Media. All Rights Reserved.
