A Pythonic framework to simplify AI service building
Run local LLMs like Llama, DeepSeek, Kokoro, etc. in your browser
PyTorch library of curated Transformer models and their components
Opinionated RAG for integrating GenAI into your apps
Terminal-based LLM chat tool with multi-model and local support
Low-code tool to rapidly build and coordinate multi-agent teams
Local CLI Copilot, powered by Ollama
Pruna is a model optimization framework built for developers
Cosmos-RL is a flexible and scalable Reinforcement Learning framework
Agents-Flex is an elegant LLM application framework similar to LangChain
Building Mixture-of-Experts from LLaMA with Continual Pre-training
Pre- & post-training, datasets, evaluation, deployment & RAG
An MCP client for Neovim that seamlessly integrates MCP servers
Public CI, Docker images for popular JAX libraries
On the Structural Pruning of Large Language Models
Accelerate local LLM inference and finetuning
Run Llama and other large language models offline on iOS and macOS
Extract structured data from webpages using LLM-powered scraping
Traditional Mandarin LLMs for Taiwan
Unleashing 10,000+ Word Generation from Long Context LLMs
Run PyTorch LLMs locally on servers, desktop and mobile
Production ready toolkit to run AI locally
A JAX research toolkit to build, edit, & visualize neural networks
Tool for generating high-quality synthetic datasets
Private chat with a local GPT over documents, images, video, and more