Showing 60 open source projects for "transcribe mp4 to text"

  • 1
    Concordia

    Crowdsourcing platform for full text transcription and tagging

    Concordia is a platform for crowdsourcing transcription and tagging of text in digitized images. It was developed by the Library of Congress so that volunteers of all backgrounds could transcribe and tag digitized images of manuscripts and typed materials from the Library’s collections that could not otherwise be done by optical character recognition.
    Downloads: 6 This Week
  • 2
    Transcripciones con Whisper (Transcriptions with Whisper). This web-based desktop application lets you transcribe, or transcribe and translate into English, audio or video files using OpenAI's Whisper model.
    Downloads: 0 This Week
  • 3
    Memvid

    Video-based AI memory library. Store millions of text chunks in MP4

    Memvid encodes text chunks as QR codes within MP4 frames to build a portable “video memory” for AI systems. This innovative approach uses standard video containers and offers millisecond-level semantic search across large corpora with dramatically less storage than vector DBs. It's self-contained—no DB needed—and supports features like PDF indexing, chat integration, and cloud dashboards.
    Downloads: 48 This Week
  • 4
    Handy STT

    A free, open source, and extensible speech-to-text application

    Handy is a free, open-source, offline speech-to-text application built for privacy, accessibility, and extensibility. Developed using Tauri (Rust + React/TypeScript), it runs natively across Windows, macOS, and Linux while performing local speech recognition without sending any audio to cloud servers. Handy allows users to start transcription instantly using a configurable keyboard shortcut—press to record, release to transcribe—and automatically pastes the resulting text into any active text field. ...
    Downloads: 89 This Week
  • 5
    stt

    Voice Recognition to Text Tool

    stt is a standalone speech recognition tool that locally converts spoken content in audio or video files into textual formats without requiring internet access, giving users control over their data and reducing reliance on external APIs. It leverages open-source speech models such as Faster-Whisper to recognize and transcribe human speech into plain text, structured JSON objects, or subtitle files with time codes, making it suitable for both personal and professional transcription tasks. The project is designed to be easy to deploy: you can run a local Python server that exposes an HTTP API for uploading audio/video files and retrieving transcriptions in different formats. ...
    Downloads: 3 This Week
  • 6
    Bootleg Text Slicer

    Text transcription & slicing tool with visual timeline and WAV output.

    ... - Record the timeline position, along with the global and per‑word timing offsets for each exported word, into a cutTemplate.txt file so that the individual words can later be played using only the source audio file. GitHub repository: https://github.com/Northstrix/bootleg-text-slicer Successfully tested with English and Italian audio files. Both scripts work, but I wouldn’t advise you to use Bootleg Text Slicer V2.py to transcribe more than 60–90 seconds at a time. Otherwise, its UI might become laggy. You can easily adjust the transcription duration by moving the start and end sliders below the timeline.
    Downloads: 1 This Week
  • 7
    Vibe

    Transcribe on your own

    Vibe is an open-source desktop application by thewh1teagle for transcribing audio and video files offline on your own computer, using OpenAI's Whisper model. ...
    Downloads: 25 This Week
  • 8
    pyVideoTrans

    Translate the video from one language to another and embed dubbing

    pyVideoTrans is an ambitious open-source multimedia processing project that assembles speech recognition, subtitle generation, AI translation, voice synthesis, and video assembly into a unified pipeline for converting videos from one language to another with embedded dubbing and captions. At its core it runs speech-to-text models to transcribe audio tracks, translates the resulting text into a target language using local or cloud-based translation engines, synthesizes new speech to match the translated subtitles, and then merges that speech back into the video, creating a fully localized media file. The tool supports both command-line and GUI modes, making it accessible to developers and creatives needing batch or automated processing.
    Downloads: 21 This Week
  • 9
    Whishper

    Transcribe any audio to text, translate and edit subtitles 100% locally

    Open-source, local-first audio transcription and subtitling suite with a simple web UI. Thanks to open-source technologies, Whishper can run 100% offline. Your data never leaves your computer. Whishper allows you to translate your transcriptions to and from more than 60 languages thanks to Argos Translate and LibreTranslate. Download the transcriptions in many formats (json, txt, vtt, srt). Easily edit your subtitles right in the Web-UI.
    Downloads: 15 This Week
  • 10
    Google AI Edge Gallery

    A gallery that showcases on-device ML/GenAI use cases

    Gallery is a curated collection of on-device machine learning examples, demo apps, and model artifacts designed to help developers experiment with and deploy ML at the edge. The project bundles runnable samples that show how to run TensorFlow Lite/Edge TPU models (and similar lightweight runtimes) on mobile and embedded platforms, demonstrating common tasks like image classification, object detection, audio recognition, and pose estimation. Each sample is intended to be both a learning aid...
    Downloads: 1,592 This Week
  • 11
    VideoCaptioner

    AI-powered tool for generating, optimizing, and translating subtitles

    ...It integrates speech recognition, language processing, and translation technologies to automatically generate and refine subtitles from video or audio sources. VideoCaptioner uses speech-to-text engines such as Whisper variants to transcribe spoken content and convert it into subtitle text with accurate timestamps. After transcription, large language models are used to intelligently restructure subtitles into natural sentences, correct wording, and improve readability for viewers. It can also translate subtitles into other languages while preserving the original timing, making it suitable for multilingual video publishing and accessibility. ...
    Downloads: 14 This Week
  • 12
    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using...
    Downloads: 11 This Week
  • 13
    Screenity

    The most powerful screen recorder & annotation tool for Chrome

    ...Annotate your screen to give feedback, emphasize your clicks, edit your recording, and much more. Make unlimited recordings of your tab, desktop, any application, and camera. Annotate by drawing anywhere on the screen, adding text, and creating arrows. Highlight your clicks, focus on your mouse, or hide it from the recording. Individual microphone and computer audio controls, push to talk, and more. Custom countdowns, show controls only on hover, and many other customization options. Export as mp4, gif, and webm, or save the video directly to Google Drive. ...
    Downloads: 11 This Week
  • 14
    AutoSubs

    Instantly generate AI-powered subtitles on your device

    AutoSubs is an open-source, AI-powered subtitle generation tool that enables users to automatically transcribe audio and video content into accurate, editable subtitles directly on their device. It supports both standalone usage and integration with professional video editing software such as DaVinci Resolve, allowing creators to generate and edit subtitles within their existing workflows. The tool leverages speech-to-text models, including OpenAI Whisper, to produce high-quality transcriptions and can differentiate between speakers using diarization techniques. ...
    Downloads: 14 This Week
  • 15
    Translate-Subtitle-File

    Subtitle Creation Assistant

    Machine translation assistant for subtitle groups. Function 1: translate subtitle files (.srt, .ass, .vtt). Function 2: voice to text (drag in a video or audio file to recognize subtitles). (The latest version v4.1.0 Update time 2021 2 May 23) 12 translation service providers can be configured, such as Google, Baidu, Tencent, Caiyun, IBM, Azure, and Amazon; 6 voice service providers can be configured: Alibaba Cloud, Xunfei, Tencent Cloud, IBM, Azure, and Amazon. Advantages: 1. You can use multiple service...
    Downloads: 6 This Week
  • 16
    Insanely Fast Whisper

    An opinionated CLI to transcribe Audio files w/ Whisper on-device

    Insanely Fast Whisper is a high-performance command-line tool designed to dramatically accelerate speech-to-text transcription using OpenAI’s Whisper models on local hardware. It leverages modern optimizations such as batch processing, mixed precision, and advanced attention mechanisms like Flash Attention to significantly reduce inference time while maintaining high transcription accuracy. The project is built on top of the Transformers ecosystem and integrates with libraries such as...
    Downloads: 0 This Week
  • 17
    Amical

    Open Source AI Dictation App

    Amical is an open source, AI-powered desktop dictation and note-taking application that enables users to dictate hands-free, transcribe meetings, and capture notes effortlessly with unmatched speed, accuracy, and privacy. It leverages both local and cloud-based AI models, letting users seamlessly switch between providers for the ideal balance of speed, precision, and control, and understands the context of each app in use to automatically format text in a tone and style appropriate to the platform. ...
    Downloads: 16 This Week
  • 18
    ShortGPT

    AI framework for automated short video creation and editing tools

    ShortGPT is an experimental AI-powered framework designed to automate the creation of short-form and long-form video content. It provides a structured system that handles multiple stages of the content creation workflow, including script generation, asset sourcing, voiceover synthesis, and video editing. ShortGPT uses large language models to generate scripts and prompts that guide the automated editing and production process. ShortGPT includes specialized content engines that manage...
    Downloads: 5 This Week
  • 19
    Buzz

    Transcribe and translate audio offline on your personal computer

    Buzz transcribes and translates audio to text offline using OpenAI's Whisper. Import audio and video files into Buzz and export them as TXT, SRT, or VTT files. Buzz supports Whisper, Whisper.cpp, Faster Whisper, Whisper-compatible models from the Hugging Face repository, and the OpenAI Whisper API. Linux versions are available from https://flathub.org/apps/io.github.chidiwilliams.Buzz and https://snapcraft.io/buzz . Buzz home page: https://github.com/chidiwilliams/buzz . Note for...
    Downloads: 4,913 This Week
  • 20
    Whisper Batch Transcriber

    Unlimited, private and free Speech-To-Text program

    About: Automatically transcribe all of your voice recordings into clean, organized text files. It is free, fully automated, and unlimited, and uses state-of-the-art speech-to-text technology. It works 100% offline on your computer, privately and locally. Use cases: Convert speeches, podcasts, webinars, monologues, storytelling, and other spoken audio into a formatted .txt file.
    Downloads: 11 This Week
  • 21
    DeepSeek AIO

    Access and use all DeepSeek AI models in one program.

    DeepSeek AIO is a simple program that allows you to interact with all DeepSeek large language models in one place. It supports text-based chats, data analysis, code generation, language translation, and more. The program is designed to make it easy for users to use DeepSeek's AI tools for different purposes without switching between multiple platforms.
    Downloads: 23 This Week
  • 22

    Whisper-Transcriber-Tool

    Desktop application that converts video files into accurate text

    WhisperTranscriber is a free desktop application that converts video files into text using OpenAI's Whisper AI model. It is aimed at journalists, researchers, students, content creators, and anyone who needs reliable transcription.

    KEY FEATURES:
    - High-accuracy AI transcription with 99+ language support
    - Works completely offline, with no internet required and total privacy
    - Supports all common video formats (mp4)
    - Batch processing for multiple files
    - Automatic language detection
    - Drag & drop interface
    - Export in SRT format
    - No file size limits

    PORTABLE VERSION:
    - No installation needed
    - Run from USB or any folder
    - FFmpeg and AI models included
    - Lightweight and fast

    WHY CHOOSE WHISPERTRANSCRIBER:
    ✓ 100% free forever, with no subscriptions or hidden costs
    ✓ Complete privacy, since all processing happens on your computer
    ✓ No account or registration required
    ✓ Professional-grade accuracy
    ✓ Works offline
    Downloads: 29 This Week
  • 23
    Morse Key Express

    Converts text to Morse sounds

    This is a simple application that converts text to Morse code and audio. It has minimal dependencies and is very lightweight. It does not yet offer the ability to transcribe Morse code and save the output to an audio file. It's a useful tool for amateur radio operators and radio operators. https://github.com/shampuan/morse-key-express
    Downloads: 0 This Week
  • 24
    AI File Sorter

    Local AI file organization with categorization and rename suggestions

    AI File Sorter is a cross-platform desktop application that uses AI (local LLMs run on your computer) to organize files and suggest meaningful file names based on real content, not just filenames or extensions. The app can analyze images locally and propose descriptive rename suggestions (for example, IMG_2048.jpg → clouds_over_lake.jpg). It can also analyze document text to improve categorization and renaming. Supported formats include PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, and common text files. For supported audio and video files, AI File Sorter can read embedded metadata (such as ID3, Vorbis, and MP4 tags) to suggest normalized names like year_artist_album_title.ext. AI analysis runs read-only, and all suggestions must be reviewed before being applied. ...
    Downloads: 247 This Week
  • 25
    The Hear

    The Hear program is made for journalists.

    To transcribe audio, the app uses the built-in speech recognition features of macOS. Turn your audio and video files into text. You can change the font size by pressing Command and the +/- keys; the chosen size is remembered between sessions. A folder named "Hear" containing the text files is created on the desktop. The program is a universal binary (arm64/x86_64). You can ask questions here: https://sourceforge.net/p/the-hear/discussion
    Downloads: 0 This Week