Search Results for "/storage/emulated/0/android/data/net.sourceforge.uiq3.fx603p/files" - Page 2

Showing 70 open source projects for "/storage/emulated/0/android/data/net.sourceforge.uiq3.fx603p/files"

1

NVIDIA NeMo

Toolkit for conversational AI

...NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI architectures are typically large and require a lot of data and compute for training. NeMo uses PyTorch Lightning for easy and performant multi-GPU/multi-node mixed-precision training. Supported models: Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), LSTM-CTC. ...

Downloads: 3 This Week

Last Update: 2026-03-23
See Project
2

NVIDIA NeMo Framework

Scalable generative AI framework built for researchers and developers

...NeMo 2.0 introduces a Python-based configuration system, replacing YAML with more flexible, programmable configs that can be versioned and composed for different experiments. The framework builds on PyTorch Lightning–style modular abstractions, so training scripts are composed from reusable components for data loading, models, optimizers, and schedulers, which simplifies experimentation and adaptation. NeMo is designed to scale: with tools like NeMo-Run, users can orchestrate large-scale experiments across thousands of GPUs.

Downloads: 2 This Week

Last Update: 2026-03-23
See Project
3

KrillinAI

Video translation and dubbing tool powered by LLMs

KrillinAI is an end-to-end content localization, translation, and dubbing tool aimed at helping creators transform videos into multiple languages with minimal manual effort. It integrates several stages of the pipeline: video acquisition (either from local files or remote via download tools), speech recognition (ASR), subtitle segmentation and alignment, machine translation (with context-aware translation to preserve semantics), and voice cloning + text-to-speech (TTS) to produce dubbed audio tracks. KrillinAI supports both landscape and portrait videos, which makes it suitable for a wide range of platforms — from YouTube to TikTok or other vertical-video sites — and ensures correct formatting and layout for the final video. ...

Downloads: 7 This Week

Last Update: 2025-11-28
See Project
4

AI Runner

Offline inference engine for art, real-time voice conversations

...It is implemented as a desktop-oriented Python application and emphasizes privacy and self-hosting, allowing users to work with text-to-speech, speech-to-text, text-to-image and multimodal models without sending data to external services. At the core of its LLM stack is a mode-based architecture with specialized “modes” such as Author, Code, Research, QA and General, and a workflow manager that automatically routes user requests to the right agent based on the task. The project has a strong focus on developer ergonomics, with thorough development guidelines, environment configuration using .env variables, and a clear structure for tests, tools and agents.

Downloads: 10 This Week

Last Update: 2025-12-11
See Project
5

Audiblez

Generate audiobooks from e-books

...It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained on under 100 hours of audio, and supports multiple languages, including English (US/UK), Spanish, French, Hindi, Italian, Japanese, Brazilian Portuguese, and Mandarin Chinese. Audiblez can run entirely from the command line via a PyPI package or through a simple cross-platform GUI built on wxPython, giving both advanced users and non-technical users an accessible workflow.

Downloads: 3 This Week

Last Update: 2025-11-30
See Project
6

MetaVoice-1B

Foundational model for human-like, expressive TTS

...Specifically, the base model (MetaVoice-1B) uses around 1.2 billion parameters and has been trained on a massive dataset — reportedly around 100,000 hours of speech data. The goal is to provide human-like, expressive, and flexible TTS: able to generate natural-sounding speech that can handle diverse inputs and likely generalize over voice styles, intonation, prosody, and perhaps multiple languages or accents. With that scale and dataset volume, MetaVoice aims to push the boundary of what open-source TTS models can achieve: high fidelity, natural prosody, and robustness even for edge cases. ...

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
7

EasyVoice

Open source text-to-speech tool, supports extra-long text

...The system supports multi-role voice acting, letting users assign different neural voices to different characters or narrative roles and configure parameters such as rate, pitch, and volume per role. It offers streaming playback so audio starts almost immediately, even for very long inputs, and automatically generates subtitle files suitable for video production or translation workflows. Under the hood, easyVoice uses a modern stack with Vue 3 and Element Plus on the front end, Node.js and Express on the back end, and TTS engines such as Microsoft Azure TTS and OpenAI-compatible APIs, orchestrated through ffmpeg.

Downloads: 2 This Week

Last Update: 2026-01-26
See Project
8

WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper

...The project aims to be for speech what Stable Diffusion is for images: powerful, hackable, and safe for commercial use, with code under Apache-2.0/MIT and models trained only on properly licensed data. Its architecture follows a token-based, multi-stage pipeline inspired by AudioLM and SPEAR-TTS: Whisper is used to produce semantic tokens, EnCodec compresses the waveform into acoustic tokens, and Vocos reconstructs high-fidelity audio from those tokens. The repository includes notebooks and scripts for inference, long-form synthesis, and finetuning, as well as pre-trained models and converted datasets hosted on Hugging Face. ...

Downloads: 2 This Week

Last Update: 2025-11-28
See Project
9

Matcha-TTS

A fast TTS architecture with conditional flow matching

...The model is fully probabilistic, so it can generate diverse realizations of the same text while still sounding stable and intelligible. The repository provides an end-to-end TTS pipeline: a PyTorch/Lightning training stack, configuration files, pre-trained checkpoints, a command-line interface, and a Gradio app for interactive testing. Users can train on standard datasets like LJSpeech or plug in their own corpora, with helper tools for computing dataset statistics, extracting phoneme durations, and running multi-GPU training.

Downloads: 1 This Week

Last Update: 2025-11-28
See Project
10

MiniMax-MCP

Official MiniMax Model Context Protocol (MCP) server

...The server is written in Python and distributed under the MIT license, with a pyproject.toml and uv-based workflow that makes installation and execution reproducible. Configuration is handled through JSON files that tell MCP clients how to launch the server (typically via uvx minimax-mcp) and which environment variables to use for the API key, host, and output directory. The README carefully explains region-specific API hosts for global and mainland users to avoid invalid-key errors, and documents both local stdio transport and SSE-based network transport modes.

Downloads: 1 This Week

Last Update: 2026-01-07
See Project
11

Lingvo

Framework for building neural networks

...Lingvo includes reference models and configurations for domains like machine translation, automatic speech recognition, language modeling, image understanding, and 3D object detection. Centralized hyperparameter configuration files allow researchers to share exact experiment setups so others can retrain and compare results reliably.

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
12

StyleTTS 2

Towards Human-Level Text-to-Speech through Style Diffusion

...StyleTTS2 supports both single-speaker and multi-speaker configurations, with the ability to sample or transfer styles from reference audio, making it powerful for expressive TTS and character voices. The repository includes training scripts, configuration files, and pre-trained auxiliary modules such as a text aligner, pitch extractor, and PL-BERT-based linguistic encoder.

Downloads: 3 This Week

Last Update: 2025-11-28
See Project
13

Voice Accounting For Blind & Mute People

Free & Easy AI Voice Accounting Software For Blind & Speechless People

Download the zip file above, extract it, and then open the index.html file in a browser such as Firefox (preferred) or Google Chrome. You can also view and download my full collection of software for people with disabilities here: https://sourceforge.net/projects/softwares-for-disabled-people/ The full collection includes the Voice Accounting Software as well.

Downloads: 0 This Week

Last Update: 2024-04-30
See Project
14

Softwares For Blind, Deaf, Handicap

Easy AI Softwares for Blind, Deaf, Handicapped, Disabled People

Download the zip file above, extract it, and then open the index.html file in a browser such as Firefox (preferred) or Google Chrome. Keep NumLock ON while using the numeric keypad of any keyboard. You can also attach an external USB keyboard with a separate numeric keypad, if required. I have added some general guidelines for students using this software on the Wiki page of this website. Please refer to them for more instructions.

Downloads: 0 This Week

Last Update: 2026-01-18
See Project
15

Bert-VITS2

VITS2 backbone with multilingual-bert

...The core idea is to use BERT-style contextual embeddings for text encoding while relying on a refined VITS2 architecture for acoustic generation and vocoding. The repository includes everything needed to train, fine-tune, and run the model, from configuration files to preprocessing scripts, spectrogram utilities, and training entrypoints for multi-GPU and multi-node setups. It provides emotional modeling through “emo embeddings,” allowing voices to be conditioned on different affective states during synthesis. Releases include optimizations for Japanese and English alignment, expanded training data, spec caching and pre-generation tools, as well as ONNX export for more lightweight inference deployments.

Downloads: 1 This Week

Last Update: 2025-11-28
See Project
16

VALL-E X

Open source implementation of Microsoft's VALL-E X zero-shot TTS model

VALL-E-X is an open-source implementation of Microsoft’s VALL-E X zero-shot text-to-speech model, focused on multilingual, cross-lingual voice cloning. It is capable of synthesizing speech in English, Chinese, and Japanese from text while mimicking the voice characteristics of a speaker given only a short 3–10 second prompt. The model attempts to match not just timbre, but also tone, pitch, emotion, and prosody of the reference audio, resulting in highly personalized output.

Downloads: 3 This Week

Last Update: 2025-11-28
See Project
17

EmotiVoice

Multi-Voice and Prompt-Controlled TTS Engine

...EmotiVoice provides multiple ways to interact with it, including a web interface, a Docker image, an HTTP API (including an OpenAI-compatible TTS API), and Python scripts for batch synthesis. It also supports voice cloning with your own data, backed by recipes for popular datasets like DataBaker and LJSpeech, so you can train or adapt voices to custom personas.

Downloads: 5 This Week

Last Update: 2025-11-30
See Project
18

TTS-Vue

Microsoft speech synthesis tool, built with Electron

...The app supports SSML (Speech Synthesis Markup Language), letting power users specify fine-grained control over pronunciation, pauses, prosody, and emphasis using XML-like markup. It includes batch conversion: users can select multiple .txt files and convert them into audio in one go, making it handy for large text collections or repetitive tasks. For long texts or big files, TTS-Vue automatically slices content into manageable segments, converts them separately, and then stitches them back into a single audio file, avoiding the usual length or timeout issues with TTS APIs.
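
The slice, convert, and stitch approach described above can be sketched roughly as follows. This is an illustrative Python sketch, not TTS-Vue's actual code: `split_text`, `synthesize_long_text`, the `synthesize` callable, and the 200-character default limit are all hypothetical, and the byte concatenation assumes a headerless audio format such as raw PCM.

```python
import re
from typing import Callable


def split_text(text: str, max_chars: int = 200) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    # Break after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-wrapped.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_long_text(
    text: str,
    synthesize: Callable[[str], bytes],  # hypothetical per-chunk TTS call
    max_chars: int = 200,
) -> bytes:
    """Convert each chunk separately, then stitch the audio back together."""
    # Plain concatenation is only valid for headerless audio (e.g. raw PCM);
    # container formats would need a proper merge step such as ffmpeg.
    return b"".join(synthesize(chunk) for chunk in split_text(text, max_chars))
```

Keeping each request under a fixed size is what sidesteps per-request length or timeout limits; the appropriate limit depends on the TTS service being called.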

Downloads: 54 This Week

Last Update: 2025-11-28
See Project
19

ekho

Chinese text-to-speech engine

ekho is a project with relatively sparse documentation, but from the repository it appears to be a small-scale tool for audio processing and playback, possibly with features for speech synthesis or manipulation. The repo includes scripts and configuration files suggesting interactions with media/audio handling libraries. Because of limited README detail, it seems targeted at users comfortable reading and modifying code, rather than end users expecting polished UIs. The code structure implies that Ekho may support hooking into audio input/output streams, perhaps for tasks like audio capture, playback, transformation, or simple voice-based operations. ...

Downloads: 7 This Week

Last Update: 2025-11-28
See Project
20

Open Speech Corpora

A list of accessible speech corpora for ASR, TTS

...The repository is organized as a set of tables that list corpora along with their languages, total hours, number of speakers, download links, and licenses, giving practitioners a quick way to find data that matches their needs. It emphasizes free and truly “open” datasets, favoring those released under Creative Commons or community-friendly data licenses, though it also lists corpora that are accessible for research and many commercial uses. The catalog covers well-known resources such as Mozilla Common Voice, Yesno, LJ Speech and numerous Nordic and parliamentary speech corpora, along with their license variants like CC-0 and CC-BY. ...

Downloads: 1 This Week

Last Update: 2025-11-28
See Project
21

WaveRNN

WaveRNN Vocoder + TTS

...The repository includes scripts and code for preprocessing datasets such as LJSpeech, training Tacotron to produce mel spectrograms, training WaveRNN on those spectrograms (with optional GTA data), and finally generating audio. A quick_start.py script allows users to immediately synthesize example sentences from a pretrained model and inspect both generated audio and attention plots. For custom TTS, the project guides you through training Tacotron, forcing GTA spectrogram export when desired, training WaveRNN with or without GTA, and then running joint generation.

Downloads: 0 This Week

Last Update: 2025-11-28
See Project
22

edge-TTS-record

Tool that can record speech synthesis

edge-TTS-record is a Windows-based tool that records speech synthesized by the Microsoft Edge browser’s online TTS voices and saves the result as .wav audio files. The idea is simple but effective: since Edge’s online TTS voices (such as “Xiaoxiao” or “Yunyang” for Chinese) are often high-quality, this tool provides a way to “capture” them offline for later use. Users can type or paste text, preview the speech, and then trigger the recorder; the system automatically captures the audio output from the browser and writes it to a WAV file. ...

Downloads: 3 This Week

Last Update: 2025-11-28
See Project
23

Mocking Bird

Clone a voice in 5 seconds to generate arbitrary speech in real-time

MockingBird is an open-source voice cloning and real-time speech generation toolkit that lets you clone a speaker’s voice from a short audio sample (reportedly as little as 5 seconds) and then synthesize arbitrary speech in that voice. It builds on deep-learning based TTS / voice-cloning technology (in the lineage of projects such as Real-Time-Voice-Cloning), but extends it with support for Mandarin Chinese and multiple Chinese speech datasets — broadening its applicability beyond English....

1 Review

Downloads: 2 This Week

Last Update: 2023-03-23
See Project
24

TTS4Anki2

Program that creates MP3 files for ANKI flashcards using Google Text-To-Speech engine

Downloads: 0 This Week

Last Update: 2021-09-18
See Project
25

Transformer TTS

Implementation of a Transformer based neural network

...This design addresses common autoregressive issues such as repetition, skipped words, and unstable attention, and results in robust, fast synthesis where all frames are predicted in parallel. The repository ships with tooling to build datasets (especially LJSpeech) and create training data, plus scripts to train both the aligner and the TTS model, monitor training with TensorBoard, and resume or reset training runs.

Downloads: 0 This Week

Last Update: 2025-11-28
See Project

