Production-tested AI infrastructure tools
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Open-source framework for intelligent speech interaction
Open-source deep-learning framework
Open-Source Financial Large Language Models
Official DeiT repository
Qwen2.5-VL is the multimodal large language model series
Strong, Economical, and Efficient Mixture-of-Experts Language Model
AlphaFold 3 inference pipeline
LLM-based Reinforcement Learning audio editing model
RGBD video generation model conditioned on camera input
Pure C inference for the Flux 2 image generation model
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
PyTorch implementation of JiT
Scaling Reinforcement Learning with LLMs
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
A trainable PyTorch reproduction of AlphaFold 3
Reference PyTorch implementation and models for DINOv3
An experimental version of the DeepSeek model
Foundation Models for Time Series
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Towards Real-World Vision-Language Understanding
Diffusion Bee is the easiest way to run Stable Diffusion locally
An AI-powered security review GitHub Action using Claude
The ChatGPT Retrieval Plugin lets you easily find personal documents