/storage/emulated/0/android/data/net.sourceforge.uiq3.fx603p/files free download

Mooncake

Mooncake is the serving platform for Kimi

Mooncake is an open-source infrastructure platform designed to optimize large language model serving by focusing on efficient management and transfer of model data and KV cache. The platform was originally developed as part of the serving infrastructure for the Kimi large language model system. Its architecture centers on a high-performance transfer engine that provides unified data transfer across different storage and networking technologies. This engine enables efficient movement of tensors and model data across heterogeneous environments such as GPU memory, system memory, and distributed storage systems. ...

Downloads: 7 This Week

Last Update: 2026-04-01

See Project

nndeploy

An Easy-to-Use and High-Performance AI Deployment Framework

nndeploy is an open-source framework designed to simplify the deployment of artificial intelligence models across multiple hardware platforms and devices. The framework focuses on making it easier to transform trained AI models into production-ready applications that can run efficiently on desktops, mobile devices, servers, and edge computing hardware. Developers can use visual workflows to design and configure AI processing pipelines by connecting modular nodes that represent different...

Downloads: 4 This Week

Last Update: 2026-04-04

See Project

llamafile

Distribute and run LLMs with a single file

...The easiest way to try it for yourself is to download our example llamafile for the LLaVA model (license: LLaMA 2, OpenAI). LLaVA is a new LLM that can do more than just chat; you can also upload images and ask it questions about them. With llamafile, this all happens locally; no data ever leaves your computer.

Downloads: 39 This Week

Last Update: 2026-03-19

See Project

tt-metal

TT-NN operator library, and TT-Metalium low level kernel programming

...Instead of following a traditional GPU model centered on massive thread parallelism, the platform is built around a grid of specialized compute nodes called Tensix cores, each with local SRAM, dedicated compute units, and multiple RISC-V control processors. The SDK provides the abstractions and APIs needed to manage data movement, compute kernels, memory coordination, and execution flow across this architecture.

Downloads: 56 This Week

Last Update: 2026-04-11

See Project

mllm

Fast Multimodal LLM on Mobile Devices

mllm is an open-source inference engine designed to run multimodal large language models efficiently on mobile devices and edge computing environments. The framework focuses on delivering high-performance AI inference in resource-constrained systems such as smartphones, embedded hardware, and lightweight computing platforms. Implemented primarily in C and C++, it is designed to operate with minimal external dependencies while taking advantage of hardware-specific acceleration technologies...

Downloads: 1 This Week

Last Update: 2026-03-09

See Project

MyScaleDB

A @ClickHouse fork that supports high-performance vector search

...The system is built on top of the ClickHouse database engine and extends it with specialized indexing and search capabilities optimized for vector embeddings. This design allows developers to store structured data, unstructured text, and high-dimensional vector embeddings within a single database platform. MyScaleDB enables developers to perform vector similarity searches using standard SQL syntax, eliminating the need to learn specialized vector database query languages. The database is optimized for high performance and scalability, allowing it to handle extremely large datasets and high query loads typical of production AI applications.

Downloads: 0 This Week

Last Update: 2026-03-10

See Project

UCCL

UCCL is an efficient communication library for GPUs

UCCL is a high-performance GPU communication library designed to support distributed machine learning workloads and large-scale AI systems. The library focuses on enabling efficient data transfer and collective communication between GPUs during training and inference processes. It supports a variety of communication patterns including collective operations such as all-reduce as well as peer-to-peer transfers that are commonly used in modern machine learning architectures. UCCL is designed to work with heterogeneous hardware environments, allowing GPUs from different vendors and network interfaces to communicate efficiently without vendor lock-in. ...

Downloads: 0 This Week

Last Update: 2026-03-14

See Project

PowerInfer

High-speed Large Language Model Serving for Local Deployment

...This hybrid execution strategy significantly reduces memory bottlenecks and improves overall inference speed. PowerInfer incorporates specialized algorithms and sparse operators to manage neuron activation patterns and minimize data transfers between hardware components. As a result, it enables powerful language models to run on consumer hardware while achieving performance comparable to more expensive server-grade systems.

Downloads: 1 This Week

Last Update: 2026-03-04

See Project

RunAnywhere

Production ready toolkit to run AI locally

...The toolkit allows developers to integrate language models, speech recognition, and voice synthesis capabilities into mobile or desktop applications while keeping all computation local. By running models entirely on device, the platform eliminates network latency and protects user data because information does not leave the device. The SDK supports popular open-source models such as Llama, Mistral, and Qwen, enabling developers to build AI-powered features such as chat interfaces and voice assistants with minimal external dependencies. It also includes integrated pipelines that combine speech-to-text, large language models, and text-to-speech into a complete conversational system.

Downloads: 0 This Week

Last Update: 2026-03-20

See Project

Alpaca.cpp

Locally run an Instruction-Tuned Chat-Style LLM

Run a fast ChatGPT-like model locally on your device. This combines the LLaMA foundation model with an open reproduction of Stanford Alpaca a fine-tuning of the base model to obey instructions (akin to the RLHF used to train ChatGPT) and a set of modifications to llama.cpp to add a chat interface. Download the zip file corresponding to your operating system from the latest release. The weights are based on the published fine-tunes from alpaca-lora, converted back into a PyTorch checkpoint...

1 Review

Downloads: 3 This Week

Last Update: 2023-03-24

See Project

DomE

Implements a reference architecture for creating information systems

...The architecture comprises elements that guarantee user access through automatically generated interfaces for various devices, integration with external information sources, data and operations security, automatic generation of analytical information, and automatic control of business processes. All these features are generated from the domain model, which is, in turn, continuously evolved from interactions with the user or autonomously by the system itself. Thus, an alternative to the traditional software production processes is proposed, which involves several stages and different actors, sometimes demanding a lot of time and money without obtaining the expected result. ...

Downloads: 0 This Week

Last Update: 2023-03-22

See Project

Search Results for "/storage/emulated/0/android/data/net.sourceforge.uiq3.fx603p/files"

Showing 11 open source projects for "/storage/emulated/0/android/data/net.sourceforge.uiq3.fx603p/files"

Mooncake

nndeploy

llamafile

tt-metal

mllm

MyScaleDB

UCCL

PowerInfer

RunAnywhere

Alpaca.cpp

DomE

Search Results for "/storage/emulated/0/android/data/net.sourceforge.uiq3.fx603p/files"

Showing 11 open source projects for "/storage/emulated/0/android/data/net.sourceforge.uiq3.fx603p/files"

Mooncake

nndeploy

llamafile

tt-metal

mllm

MyScaleDB

UCCL

PowerInfer

RunAnywhere

Alpaca.cpp

DomE

Related Searches

Related Categories