High-level training, data augmentation, and utilities for PyTorch
AutoGluon: AutoML for Image, Text, and Tabular Data
The agent that grows with you
Python-based neural networks API
Personal AI, On Personal Devices
Official inference repo for FLUX.1 models
Python inference and LoRA trainer package for the LTX-2 audio–video
Stable Diffusion web UI
Offline text-to-speech synthesis for Python
OCRmyPDF adds an OCR text layer to scanned PDF files
Video-based AI memory library. Store millions of text chunks in MP4
Robust Speech Recognition via Large-Scale Weak Supervision
A high-throughput and memory-efficient inference and serving engine
Public repository for Agent Skills
Image polygonal annotation with Python
1 minute of voice data can also be used to train a good TTS model
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
An Efficient Web-enhanced Question Answering System
An LLM Compiler for Parallel Function Calling
Source code of PyGAD, Python 3 library for building genetic algorithms
Reverse-engineered Python API for Google Gemini web app
Awesome multilingual OCR toolkits based on PaddlePaddle
The most powerful and modular diffusion model GUI, API, and backend
3D reconstruction software
A simple, high-quality voice conversion tool focused on ease of use