Large Multimodal Models for Video Understanding and Editing
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
LLM-based audio editing model trained with reinforcement learning
Open-weight, large-scale hybrid-attention reasoning model
Large language model & vision-language model based on linear attention
Capable of understanding text, audio, vision, video
Chat & pretrained large vision language model
Pushing the Limits of Mathematical Reasoning in Open Language Models
Chat & pretrained large audio language model proposed by Alibaba Cloud
Release for Improved Denoising Diffusion Probabilistic Models
Official DeiT repository
Real-time behaviour synthesis with MuJoCo, using Predictive Control
High-Resolution Image Synthesis with Latent Diffusion Models
StudioOllamaUI is a local, portable interface for Ollama
Open-source, high-performance Mixture-of-Experts large language model
A Conversational Speech Generation Model
AI-powered tool to quickly remove watermarks from images flawlessly
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
AI Suite for upscaling, interpolating & restoring images/videos
Powerful open-source image generation model
Example Discord bot written in Python that uses the completions API
Open Multilingual Multimodal Chat LMs
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Towards Ultimate Expert Specialization in Mixture-of-Experts Language