Showing 47 open source projects for "gradio"

View related business solutions
  • Accounting practice management software Icon
    Accounting practice management software

    Accountants, accounting firms, tax attorneys, tax professionals

    Canopy is a cloud-based practice management software for accounting and tax firms, offering tools for client engagement, document management, workflow automation, and time & billing. Its Client Engagement platform centralizes interactions with a secure portal, customizable branding, and email integration, while the Document Management system enables organized, paperless file storage. The Workflow module enhances visibility into tasks and projects through templates, task assignments, and automation, reducing human error. Additionally, the Time & Billing feature tracks billable hours, generates invoices, and processes payments, ensuring accurate financial management. With its comprehensive features, Canopy streamlines operations, reduces stress, and enhances client experiences.
    Learn More
  • Easily build robust connections between Salesforce and any platform Icon
    Easily build robust connections between Salesforce and any platform

    We help companies using Salesforce connect their data with a no-code Salesforce-native solution.

    Like having Postman inside Salesforce! Declarative Webhooks allows users to quickly and easily configure bi-directional integrations between Salesforce and external systems using a point-and-click interface. No coding is required, making it a fast and efficient and as a native solution, Declarative Webhooks seamlessly integrates with Salesforce platform features such as Flow, Process Builder, and Apex. You can also leverage the AI Integration Agent feature to automatically build your integration templates by providing it with links to API documentation.
    Learn More
  • 1
    Gradio

    Gradio

    Create UIs for your machine learning model in Python in 3 minutes

    Gradio is the fastest way to demo your machine learning model with a friendly web interface so that anyone can use it, anywhere! Gradio can be installed with pip. Creating a Gradio interface only requires adding a couple lines of code to your project. You can choose from a variety of interface types to interface your function. Gradio can be embedded in Python notebooks or presented as a webpage.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 2
    Fooocus

    Fooocus

    Focus on prompting and generating

    Fooocus is an open-source image generation software that simplifies the process of creating images from text prompts. Built on Gradio and leveraging Stable Diffusion XL, Fooocus eliminates the need for manual parameter tweaking, allowing users to focus solely on crafting prompts. It offers a user-friendly interface with minimal setup, making advanced image synthesis accessible to a broader audience.
    Downloads: 250 This Week
    Last Update:
    See Project
  • 3
    TTS WebUI

    TTS WebUI

    A single Gradio + React WebUI with extensions for ACE-Step

    TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 4
    Voice-Pro

    Voice-Pro

    Comprehensive Gradio WebUI for audio processing

    Voice-Pro is the best gradio WebUI for transcription, translation and text-to-speech. It can be easily installed with one click. Create a virtual environment using Miniconda, running completely separate from the Windows system (fully portable). Supports real-time transcription and translation, as well as batch mode.
    Downloads: 38 This Week
    Last Update:
    See Project
  • Multi-Entity Cloud Accounting Software for Growing Businesses Icon
    Multi-Entity Cloud Accounting Software for Growing Businesses

    Built for small to midsize businesses that have outgrown entry-level accounting or legacy ERP solutions.

    Built natively on the Microsoft Power Platform (Dynamics 365), Gravity delivers robust multi-entity financial management with seamless integration to Microsoft 365, Power BI, Teams + Copilot — no third-party add-ons required.
    Learn More
  • 5
    Applio

    Applio

    A simple, high-quality voice conversion tool focused on ease of use

    Applio is a high-quality voice conversion toolkit designed to make modern RVC/VITS-based voice cloning accessible to non-experts. It focuses strongly on ease of use: installation scripts for Windows, Linux, and macOS set up dependencies and then launch a browser-based Gradio interface. Within that interface, users can train and run voice conversion models for tasks like singing conversion, speech-to-speech transformation, and voice cloning. The project is structured to be flexible through plugins and configurations so users can extend functionality without touching the core code. Applio is considered stable and mature; ongoing development is now centered on security patches, dependency maintenance, and occasional improvements, which makes it attractive for production or repeatable workflows. ...
    Downloads: 94 This Week
    Last Update:
    See Project
  • 6
    Stable Diffusion web UI for AMDGPUs

    Stable Diffusion web UI for AMDGPUs

    Stable Diffusion WebUI optimized for AMD GPUs with editing tools

    Stable Diffusion WebUI AMDGPU is a browser-based interface for generating images using Stable Diffusion, built with Gradio and adapted for AMD graphics hardware. It provides both text-to-image and image-to-image workflows, allowing users to create, refine, and upscale visuals within a single interface. It includes tools such as inpainting and outpainting for editing specific areas of an image, along with features like prompt matrix generation and attention controls to fine-tune outputs. ...
    Downloads: 16 This Week
    Last Update:
    See Project
  • 7
    SoniTranslate

    SoniTranslate

    Synchronized Translation for Videos

    SoniTranslate is a video translation and dubbing system that produces synchronized target-language audio tracks for existing video content. It provides a web UI built with Gradio, allowing users to upload a video, choose source and target languages, and then run a pipeline that handles transcription, translation and re-synthesis of speech. Under the hood, it uses advanced speech and diarization models to separate speakers, align audio with timecodes and respect subtitle timing, which lets the generated dub track stay in sync with the original video structure. ...
    Downloads: 27 This Week
    Last Update:
    See Project
  • 8
    EPUB to Audiobook Converter

    EPUB to Audiobook Converter

    EPUB to audiobook converter, optimized for Audiobookshelf

    ...The project supports multiple TTS providers, including Microsoft Azure TTS, EdgeTTS, OpenAI TTS, local Piper, and Kokoro via an OpenAI-compatible endpoint, allowing users to choose between cloud and self-hosted voices. A recent addition is a Gradio-based WebUI, which wraps all configuration options in a graphical interface for users who prefer not to work with the command line. The tool offers advanced options such as controlling chapter ranges, handling paragraph detection via newline modes, removing endnote markers, and using regex-based search-and-replace files to tweak pronunciations. ...
    Downloads: 26 This Week
    Last Update:
    See Project
  • 9
    Matcha-TTS

    Matcha-TTS

    A fast TTS architecture with conditional flow matching

    ...The repository provides an end-to-end TTS pipeline: a PyTorch/Lightning training stack, configuration files, pre-trained checkpoints, a command-line interface, and a Gradio app for interactive testing. Users can train on standard datasets like LJSpeech or plug in their own corpora, with helper tools for computing dataset statistics, extracting phoneme durations, and running multi-GPU training.
    Downloads: 16 This Week
    Last Update:
    See Project
  • Attack Surface Management | Criminal IP ASM Icon
    Attack Surface Management | Criminal IP ASM

    For security operations, threat-intelligence and risk teams wanting a tool to get access to auto-monitored assets exposed to attack surfaces

    Criminal IP’s Attack Surface Management (ASM) is a threat-intelligence–driven platform that continuously discovers, inventories, and monitors every internet-connected asset associated with an organization, including shadow and forgotten resources, so teams see their true external footprint from an attacker’s perspective. The solution combines automated asset discovery with OSINT techniques, AI enrichment and advanced threat intelligence to surface exposed hosts, domains, cloud services, IoT endpoints and other Internet-facing vectors, capture evidence (screenshots and metadata), and correlate findings to known exploitability and attacker tradecraft. ASM prioritizes exposures by business context and risk, highlights vulnerable components and misconfigurations, and provides real-time alerts and dashboards to speed investigation and remediation.
    Learn More
  • 10
    FastRTC

    FastRTC

    The python library for real-time communication

    ...This makes it particularly well suited for building real-time voice (or video) interfaces for applications such as AI assistants, live chat, or collaborative audio/video tools. FastRTC also integrates nicely with UI frameworks (e.g. via a web demo using Gradio), so developers can rapidly prototype and deploy real-time streaming applications without deep knowledge of low-level WebRTC internals. Because voice-enabled AI agents often involve many moving parts (speech-to-text, text processing, text-to-speech, streaming, session/chat management), FastRTC helps by handling the streaming aspect, leaving the rest to be plugged in modularly.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 11
    ChatGLM3

    ChatGLM3

    ChatGLM3 series: Open Bilingual Chat LLMs | Open Source Bilingual Chat

    ...It keeps the series’ smooth dialog and low deployment cost while adding native tool use (function calling), a built-in code interpreter, and agent-style workflows. The family includes base and long-context variants (8K/32K/128K). The repo ships Python APIs, CLI and web demos (Gradio/Streamlit), an OpenAI-format API server, and a compact fine-tuning kit. Quantization (4/8-bit), CPU/MPS support, and accelerator backends (TensorRT-LLM, OpenVINO, chatglm.cpp) enable lightweight local or edge deployment.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 12
    HunyuanDiT

    HunyuanDiT

    Diffusion Transformer with Fine-Grained Chinese Understanding

    ...It supports adapters like ControlNet, IP-Adapter, LoRA, and can run under constrained VRAM via distillation versions. LoRA, ControlNet (pose, depth, canny), IP-adapter to extend control over generation. Integration with Gradio for web demos and diffusers / command-line compatibility. Supports multi-turn T2I (text-to-image) interactions so users can iteratively refine their images via dialogue.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    ChatTTS_colab

    ChatTTS_colab

    One-click deployment (including offline integration package)

    ...It provides an integrated offline bundle and scripts for Windows and macOS so users can run ChatTTS locally without wrestling with complex environment setup. The repository includes Colab notebooks that launch a Gradio-based web UI and expose streaming TTS, making it possible to listen to generated audio as it is produced. A distinctive feature is the “voice gacha” system, which batch-generates many distinct voice timbres and allows users to save the ones they like into a curated voice library. It has first-class support for long-form audio generation, making it suitable for audiobooks, podcasts, or long narration tasks. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    LLaMA Efficient Tuning

    LLaMA Efficient Tuning

    Easy-to-use LLM fine-tuning framework (LLaMA-2, BLOOM, Falcon

    Easy-to-use LLM fine-tuning framework (LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, ChatGLM2)
    Downloads: 5 This Week
    Last Update:
    See Project
  • 15
    Speech-AI-Forge

    Speech-AI-Forge

    Speech-AI-Forge is a project developed around TTS generation model

    Speech-AI-Forge is a full-stack project built around modern text-to-speech generation models, providing both an API server and a Gradio-based web UI for interactive use. At its core, it acts as a hub that wires together multiple speech-related capabilities, including TTS, speech-to-text and LLM-based control flows, behind a consistent interface. The system is designed to be deployed in several ways: you can try it online via hosted demos, spin it up in a one-click Colab environment, run it in Docker containers, or set it up locally with its environment preparation scripts. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 16
    Janus

    Janus

    Unified Multimodal Understanding and Generation Models

    ...By splitting those pathways but keeping one unified core transformer, Janus maintains flexibility and achieves strong performance across tasks previously requiring distinct architectures. The repository includes pretrained checkpoints (for example 1.3B and 7B parameter versions), a Gradio demo, and guidance for local deployment.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    kotaemon

    kotaemon

    An open-source RAG-based tool for chatting with your documents

    An open-source clean & customizable RAG UI for chatting with your documents. Built with both end users and developers in mind. This project serves as a functional RAG UI for both end users who want to do QA on their documents and developers who want to build their own RAG pipeline.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 18
    AUTOMATIC1111 Stable Diffusion web UI
    AUTOMATIC1111's stable-diffusion-webui is a powerful, user-friendly web interface built on the Gradio library that allows users to easily interact with Stable Diffusion models for AI-powered image generation. Supporting both text-to-image (txt2img) and image-to-image (img2img) generation, this open-source UI offers a rich feature set including inpainting, outpainting, attention control, and multiple advanced upscaling options. With a flexible installation process across Windows, Linux, and Apple Silicon, plus support for GPUs and CPUs, it caters to a wide range of users—from hobbyists to professionals. ...
    Downloads: 320 This Week
    Last Update:
    See Project
  • 19
    VideoChat

    VideoChat

    Real-time voice interactive digital human

    ...It supports both pure end-to-end voice solutions based on multimodal large language models (GLM-4-Voice feeding directly into talking-head generation) and a more traditional cascaded pipeline using ASR → LLM → TTS → talking head. It is built as a Gradio Python demo, exposing a web interface where users can talk to an animated avatar that lip-syncs to synthesized speech while responding intelligently. The system is customizable: you can define your own avatar appearance and voice, and it supports voice cloning so you can generate a new voice from a short 3–10 second reference sample. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    Chatterbox

    Chatterbox

    SoTA open-source TTS

    ...It's also the first open source TTS model to support emotion exaggeration control, a powerful feature that makes your voices stand out. Try it now on our Hugging Face Gradio app. If you like the model but need to scale or tune it for higher accuracy, check out our competitively priced TTS service (link). It delivers reliable performance with ultra-low latency of sub-200ms—ideal for production use in agents, applications, or interactive media.
    Downloads: 21 This Week
    Last Update:
    See Project
  • 21
    MagicTime

    MagicTime

    Time-lapse Video Generation Models as Metamorphic Simulators

    This repository is the official implementation of MagicTime, a metamorphic video generation pipeline based on the given prompts. The main idea is to enhance the capacity of video generation models to accurately depict the real world through our proposed methods and dataset. Compared to general videos, metamorphic videos contain physical knowledge, long persistence, and strong variation, making them difficult to generate.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Text Generation Web UI

    Text Generation Web UI

    Oobabooga - The definitive Web UI for local AI, with powerful features

    A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA. Dropdown menu for switching between models. Notebook mode that resembles OpenAI's playground. Chat mode for conversation and role playing. Instruct mode compatible with Alpaca and Open Assistant formats. Nice HTML output for GPT-4chan.
    Downloads: 85 This Week
    Last Update:
    See Project
  • 23
    Whisper-WebUI

    Whisper-WebUI

    A Web UI for easy subtitle using whisper model

    Whisper WebUI is an open-source browser-based interface that simplifies the use of Whisper speech recognition models by providing an intuitive graphical environment for transcription, translation, and subtitle generation. Built with Gradio, it allows users to upload audio or video files, process them locally, and generate accurate text outputs without relying on command-line tools. The platform integrates optimized implementations such as faster-whisper, significantly improving transcription speed and reducing memory usage compared to standard models. It supports multiple input sources including local files, YouTube content, and microphone input, making it versatile for different workflows. ...
    Downloads: 19 This Week
    Last Update:
    See Project
  • 24
    Hunyuan3D-2.1

    Hunyuan3D-2.1

    From Images to High-Fidelity 3D Assets

    Hunyuan3D-2.1 is Tencent Hunyuan’s advanced 3D asset generation system that produces high-fidelity 3D models with Physically Based Rendering (PBR) textures. It is fully open-source with released model weights, training, and inference code. It improves on prior versions by using a PBR texture pipeline (enabling realistic material effects like reflections and subsurface scattering) and allowing community fine-tuning and extension. It supports both shape generation (mesh geometry) and texture...
    Downloads: 23 This Week
    Last Update:
    See Project
  • 25
    CogVideo

    CogVideo

    Text and image to video generation: CogVideoX and CogVideo

    CogVideo is an open-source family of advanced video generation models that can create videos from text, images, or existing video inputs. Built on large-scale Transformer and diffusion architectures, it enables multimodal generation across text-to-video, image-to-video, and video continuation tasks. The latest CogVideoX models offer higher resolution outputs, longer video durations, and improved controllability through prompt engineering. The project includes tools for inference,...
    Downloads: 18 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next
MongoDB Logo MongoDB