A fast TTS architecture with conditional flow matching
SOTA discrete acoustic codec models with 40/75 tokens per second
One-click deployment (including offline integration package)
Foundational model for human-like, expressive TTS
A TTS model capable of generating ultra-realistic dialogue
Pokee Deep Research Model Open Source Repo
DeepMind model for tracking arbitrary points across videos & robotics
Global weather forecasting model using graph neural networks and JAX
An alignment auditing agent capable of exploring alignment hypotheses
Tooling for the Common Objects In 3D dataset
Code for Mesh R-CNN, ICCV 2019
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
PyTorch code and models for VJEPA2 self-supervised learning from video
Language modeling in a sentence representation space
An AI-powered security review GitHub Action using Claude
GPT-4V-level open-source multimodal model based on Llama3-8B
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Evals is a framework for evaluating LLMs and LLM systems
The ChatGPT Retrieval Plugin lets you easily find personal documents
Designed for text embedding and ranking tasks
Implementation of the Surya Foundation Model for Heliophysics
A modular high-level library to train embodied AI agents
Revolutionizes the way users interact with AutoGen
LLM training code for MosaicML foundation models