sandbox:/mnt/data/project_plan.pod free download

Diffgram

Training data (data labeling, annotation, workflow) for all data types

From ingesting data to exploring it, annotating it, and managing workflows. Diffgram is a single application that will improve your data labeling and bring all aspects of training data under a single roof. Diffgram is world’s first truly open source training data platform that focuses on giving its users an unlimited experience. This is aimed to reduce your data labeling bills and increase your Training Data Quality.

Downloads: 2 This Week

Last Update: 2024-10-14

See Project

Omnilingual ASR

Omnilingual ASR Open-Source Multilingual SpeechRecognition

Omnilingual-ASR is a research codebase exploring automatic speech recognition that generalizes across a very large number of languages using shared modeling and training recipes. It focuses on leveraging self-supervised audio pretraining and scalable fine-tuning so low-resource languages can benefit from high-resource data. The project provides data preparation pipelines, training scripts, decoding utilities, and evaluation tools so researchers can reproduce results and extend to new language sets. It emphasizes modularity: acoustic modeling, language modeling, tokenization, and decoding are separable pieces you can swap or ablate. The repo is aimed at pushing practical multilingual ASR—robust to accents, code-switching, and domain shifts—rather than language-by-language systems. ...

Downloads: 1 This Week

Last Update: 2025-12-12

See Project

Kaldi

kaldi-asr/kaldi is the official location of the Kaldi project

...Kaldi is designed for researchers who need a highly customizable environment to experiment with new algorithms, as well as for practitioners who want robust, production-ready ASR pipelines. It includes extensive tools for data preparation, feature extraction, acoustic and language modeling, decoding, and evaluation. With its modular design, Kaldi allows users to adapt the system to a wide range of languages and domains. As one of the most influential projects in speech recognition, it has become a foundation for much of the modern work in ASR.

Downloads: 3 This Week

Last Update: 3 days ago

See Project

NVIDIA NeMo

Toolkit for conversational AI

...NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI architectures are typically large and require a lot of data and compute for training. NeMo uses PyTorch Lightning for easy and performant multi-GPU/multi-node mixed-precision training. Supported models: Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), LSTM-CTC. ...

Downloads: 3 This Week

Last Update: 2026-03-23

See Project

SpeechRecognition

Speech recognition module for Python

Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using pip install SpeechRecognition. ...

Downloads: 4 This Week

Last Update: 2026-04-05

See Project

WhisperJAV

A subtitle generator for Japanese Adult Videos.

...Transformer-based ASR architectures like Whisper suffer significant performance degradation when applied to the spontaneous and noisy domain of JAV. This degradation is driven by specific acoustic and temporal characteristics that defy the statistical distributions of standard training data.

1 Review

Downloads: 58 This Week

Last Update: 7 days ago

See Project

Tensor2Tensor

Library of deep learning models and datasets

Deep Learning (DL) has enabled the rapid advancement of many useful technologies, such as machine translation, speech recognition and object detection. In the research community, one can find code open-sourced by the authors to help in replicating their results and further advancing deep learning. However, most of these DL systems use unique setups that require significant engineering effort and may only work for a specific problem or architecture, making it hard to run new experiments and...

Downloads: 0 This Week

Last Update: 2021-05-24

See Project

Tensorpack

A Neural Net Training Interface on TensorFlow, with focus on speed

...Uses TensorFlow in the efficient way with no extra overhead. On common CNNs, it runs training 1.2~5x faster than the equivalent Keras code. Your training can probably gets faster if written with Tensorpack. Scalable data-parallel multi-GPU / distributed training strategy is off-the-shelf to use. Squeeze the best data loading performance of Python with tensorpack.dataflow. Symbolic programming (e.g. tf.data) does not offer the data processing flexibility needed in research. Tensorpack squeezes the most performance out of pure Python with various auto parallelization strategies. ...

Downloads: 0 This Week

Last Update: 2022-08-01

See Project

Search Results for "sandbox:/mnt/data/project_plan.pod"

Showing 8 open source projects for "sandbox:/mnt/data/project_plan.pod"

Diffgram

Omnilingual ASR

Kaldi

NVIDIA NeMo

SpeechRecognition

WhisperJAV

Tensor2Tensor

Tensorpack

Search Results for "sandbox:/mnt/data/project_plan.pod"

Showing 8 open source projects for "sandbox:/mnt/data/project_plan.pod"

Diffgram

Omnilingual ASR

Kaldi

NVIDIA NeMo

SpeechRecognition

WhisperJAV

Tensor2Tensor

Tensorpack

Related Searches

Related Categories