CLIP ViT-bigG/14: Zero-shot image-text model trained on LAION-2B
Vision-language-action model for robot control via images and text
Tiny pre-trained IBM model for multivariate time series forecasting
Metric monocular depth estimation (vision model)
CLIP model fine-tuned for zero-shot fashion product classification
VaultGemma: 1B Gemma variant trained with differential privacy (DP) for privacy-sensitive NLP tasks
Qwen3-Next: 80B instruct LLM with ultra-long context up to 1M tokens
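The CLIP entries above perform zero-shot classification by scoring an image embedding against one text embedding per candidate label. A minimal sketch of that scoring step, using random placeholder vectors in place of real CLIP outputs (the 512-dim size, the three labels, and the ~100 logit scale are illustrative assumptions, not exact model values):

```python
import numpy as np

# Placeholder embeddings standing in for real CLIP encoder outputs:
# one image embedding and one text embedding per candidate label.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)
text_embs = rng.normal(size=(3, 512))  # e.g. prompts "a shirt", "a shoe", "a hat"

# CLIP scores labels by cosine similarity: L2-normalize both sides,
# then take dot products, scaled by a learned temperature (~100 in CLIP).
image_emb = image_emb / np.linalg.norm(image_emb)
text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
logits = 100.0 * (text_embs @ image_emb)

# Softmax over labels turns similarities into zero-shot class probabilities.
probs = np.exp(logits - logits.max())
probs /= probs.sum()
pred = int(np.argmax(probs))
print(pred, probs)
```

The fashion-classification entry works the same way: fine-tuning only shifts the embedding spaces, while inference remains this normalize, dot-product, softmax pipeline over label prompts.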