8 projects for "transcribe" with 2 filters applied:

  • 1
    AutoSubs

    Instantly generate AI-powered subtitles on your device

    AutoSubs is an open-source, AI-powered subtitle generation tool that enables users to automatically transcribe audio and video content into accurate, editable subtitles directly on their device. It supports both standalone usage and integration with professional video editing software such as DaVinci Resolve, allowing creators to generate and edit subtitles within their existing workflows. The tool leverages speech-to-text models, including OpenAI Whisper, to produce high-quality transcriptions and can differentiate between speakers using diarization techniques. ...
    Downloads: 14 This Week
  • 2
    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    ...It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps. After transcription, large language models are used to intelligently restructure subtitles into natural sentences, correct wording, and improve readability for viewers. It can also translate subtitles into other languages while preserving the original timing, making it suitable for multilingual video publishing and accessibility. ...
    Downloads: 14 This Week
  • 3
    BasedHardware

    Open source AI wearable platform for recording and summarizing speech

    ...It combines hardware, firmware, mobile applications, and backend services to create a complete ecosystem for voice-driven interaction. Users can connect the wearable device to a mobile phone and automatically record and transcribe meetings, conversations, and voice memos. Omi includes firmware for wearable hardware, a Flutter-based mobile companion application, backend services built with Python and FastAPI, and various SDKs for developers. These components work together to process audio, perform speech recognition, and integrate AI features such as summaries and automated actions. ...
    Downloads: 8 This Week
  • 4
    ShortGPT

    AI framework for automated short video creation and editing tools

    ShortGPT is an experimental AI-powered framework designed to automate the creation of short-form and long-form video content. It provides a structured system that handles multiple stages of the content creation workflow, including script generation, asset sourcing, voiceover synthesis, and video editing. ShortGPT uses large language models to generate scripts and prompts that guide the automated editing and production process. ShortGPT includes specialized content engines that manage...
    Downloads: 5 This Week
  • 5
    Groq TypeScript / Node.js

    The official Node.js / TypeScript library for the Groq API

    ...The library also supports passing different input types (file streams, blobs, fetch responses) for media-related endpoints, making it flexible for diverse environments (backend, browser, serverless). With this SDK, developers can call Groq’s models, transcribe audio, perform file uploads — all with minimal boilerplate — which streamlines creation of AI-enabled applications in the JavaScript/TypeScript ecosystem.
    Downloads: 6 This Week
  • 6
    Piano transcription

    Task of transcribing piano recordings into MIDI files

    ...The authors used this system to build a large-scale classical piano MIDI dataset (see next project), but as a standalone tool it enables researchers, musicians, or hobbyists to transcribe their own piano recordings automatically.
    Downloads: 1 This Week
  • 7
    wav2vec2-large-xlsr-53-portuguese

    Portuguese ASR model fine-tuned on XLSR-53 for 16kHz audio input

    wav2vec2-large-xlsr-53-portuguese is an automatic speech recognition (ASR) model fine-tuned on Portuguese using the Common Voice 6.1 dataset. It is based on Facebook’s wav2vec2-large-xlsr-53, a multilingual self-supervised learning model, and is optimized to transcribe Portuguese speech sampled at 16kHz. The model performs well without a language model, though adding one can improve word error rate (WER) and character error rate (CER). It achieves a WER of 11.3% (or 9.01% with LM) on Common Voice test data, demonstrating high accuracy for a single-language ASR model. Inference can be done using HuggingSound or via a custom PyTorch script using Hugging Face Transformers and Librosa. ...
    Downloads: 0 This Week
  • 8
    wav2vec2-large-xlsr-53-russian

    Russian ASR model fine-tuned on Common Voice and CSS10 datasets

    ...It was trained using Mozilla’s Common Voice 6.1 and CSS10 datasets to recognize Russian speech with high accuracy. The model operates best with audio sampled at 16kHz and can transcribe Russian speech directly without a language model. It achieves a Word Error Rate (WER) of 13.3% and Character Error Rate (CER) of 2.88% on the Common Voice test set, with even better results when used with a language model. The model supports both PyTorch and JAX and is compatible with the Hugging Face Transformers and HuggingSound libraries. ...
    Downloads: 0 This Week
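
The two wav2vec2 listings above quote Word Error Rate (WER) and Character Error Rate (CER). As a rough illustration of what those figures measure, here is a minimal sketch that computes both as edit distance divided by reference length. The function names are hypothetical, and the sketch applies no text normalization (casing, punctuation), which published evaluations usually do, so it is not expected to reproduce the quoted numbers exactly.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (counts substitutions, insertions, and deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    print(f"WER: {wer(ref, hyp):.3f}")
    print(f"CER: {cer(ref, hyp):.3f}")
```

Because both rates divide by the reference length, a transcript with many inserted words can score above 100%.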