Analyze computation-communication overlap in V3/R1
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Industrial-level controllable zero-shot text-to-speech system
A Powerful Native Multimodal Model for Image Generation
Generating Immersive, Explorable, and Interactive 3D Worlds
Language modeling in a sentence representation space
CLIP: Predict the most relevant text snippet given an image
Implementation of "MobileCLIP" CVPR 2024
Controllable & emotion-expressive zero-shot TTS
Revolutionizing Database Interactions with Private LLM Technology
HY-Motion model for 3D character animation generation
Sharp Monocular Metric Depth in Less Than a Second
ICLR 2024 Spotlight: curation/training code, metadata, distribution
Powerful AI language model (MoE) optimized for efficiency/performance
Open-source, high-performance AI model with advanced reasoning
Let's make video diffusion practical
State-of-the-art TTS model under 25MB
Easy Docker setup for Stable Diffusion with user-friendly UI
Pretrained time-series foundation model developed by Google Research
Wan2.2: Open and Advanced Large-Scale Video Generative Model
The Clay Foundation Model - An open source AI model and interface
From Vibe Coding to Agentic Engineering
Code for running inference and finetuning with SAM 3 model
Uncommon Objects in 3D dataset
Open-source deep-learning framework