OCR expert VLM powered by Hunyuan's native multimodal architecture
Release for Improved Denoising Diffusion Probabilistic Models
DeepSeek LLM: Let there be answers
Open-source large language model by Alibaba
Open Multilingual Multimodal Chat LMs
Encoder for text longer than a single word, trained on a variety of data
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Software that can generate photos from paintings
Official repo for consistency models
800,000 step-level correctness labels on LLM solutions to MATH problems
Chinese LLaMA & Alpaca large language model + local CPU/GPU training
Repo for external large-scale work
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Locally run an Instruction-Tuned Chat-Style LLM
Learning to Act by Watching Unlabeled Online Videos
PyTorch implementation of MAE
An implementation of model-parallel GPT-2 and GPT-3-style models
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
Large-scale autoregressive pixel model for image generation by OpenAI
Generate embeddings from large-scale graph-structured data
A library for Multilingual Unsupervised or Supervised word Embeddings
Dual LSTM Encoder for Dialog Response Generation
CLIP ViT-bigG/14: Zero-shot image-text model trained on LAION-2B
Vision-language-action model for robot control via images and text
Tiny pre-trained IBM model for multivariate time series forecasting