Foundational Models for State-of-the-Art Speech and Text Translation
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Advancing Formal Mathematical Reasoning via Reinforcement Learning
FlashMLA: Efficient Multi-head Latent Attention Kernels
StudioOllamaUI is a local, portable interface for Ollama
Release for Improved Denoising Diffusion Probabilistic Models
DeepSeek LLM: Let there be answers
Open-source large language model by Alibaba
AI Suite for upscaling, interpolating & restoring images/videos
AI-powered tool to quickly remove watermarks from images flawlessly
Open Multilingual Multimodal Chat LMs
Encoder of greater-than-word length text trained on a variety of data
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Software that can generate photos from paintings
Official repo for consistency models
800,000 step-level correctness labels on LLM solutions to MATH problems
Chinese LLaMA & Alpaca large language model + local CPU/GPU training
Repo for external large-scale work
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Locally run an Instruction-Tuned Chat-Style LLM
Learning to Act by Watching Unlabeled Online Videos
PyTorch implementation of MAE
An implementation of model parallel GPT-2 and GPT-3-style models
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)