Showing 884 open source projects for "sandbox:/mnt/data/project_plan.pod"

  • 1
    Ubix Linux

    The Pocket Datalab

    Ubix stands for Universal Business Intelligence Computing System. Ubix Linux is an open-source, Debian-based Linux distribution geared towards data acquisition, transformation, analysis and presentation. Its purpose is to offer a tiny but versatile datalab. Ubix Linux is easily accessible, resource-efficient and completely portable on a simple USB key. It is a perfect toolset for learning data analysis and artificial intelligence basics on small to medium datasets. ...
    Downloads: 4 This Week
  • 2
    EmotiVoice

    Multi-Voice and Prompt-Controlled TTS Engine

    ...EmotiVoice provides multiple ways to interact with it, including a web interface, a Docker image, an HTTP API (including an OpenAI-compatible TTS API), and Python scripts for batch synthesis. It also supports voice cloning with your own data, backed by recipes for popular datasets like DataBaker and LJSpeech, so you can train or adapt voices to custom personas.
    Downloads: 5 This Week
  • 3
    towhee

    Framework that is dedicated to making neural data processing

    Towhee is an open-source machine-learning pipeline that helps you encode your unstructured data into embeddings. You can use our Python API to build a prototype of your pipeline and use Towhee to automatically optimize it for production-ready environments. From images to text to 3D molecular structures, Towhee supports data transformation for nearly 20 different unstructured data modalities. We provide end-to-end pipeline optimizations, covering everything from data decoding/encoding, to model inference, making your pipeline execution 10x faster. ...
    Downloads: 0 This Week
  • 4
    LLaMA-MoE

    Building Mixture-of-Experts from LLaMA with Continual Pre-training

    LLaMA-MoE is an open-source project that builds mixture-of-experts language models from LLaMA through expert partitioning and continual pre-training. The repository is centered on making MoE research more accessible by offering smaller and more affordable models with only about 3.0 to 3.5 billion activated parameters, which helps reduce deployment and experimentation costs. Its architecture works by splitting LLaMA feed-forward networks into sparse experts and adding gating mechanisms so...
    Downloads: 1 This Week
  • 5
    DB-GPT-Hub

    A repository that contains models, datasets, and fine-tuning

    ...The project serves as a specialized extension of the broader DB-GPT ecosystem, focusing on the preparation and evaluation of models capable of translating natural language questions into structured database queries. It offers a modular framework that supports data preparation, model fine-tuning, benchmarking, and inference for Text-to-SQL systems. The repository includes datasets and experiment configurations that allow researchers to train models on real database schemas and evaluate them using standardized benchmarks. Its design encourages experimentation with different large language models and fine-tuning techniques, including parameter-efficient training approaches.
    Downloads: 2 This Week
  • 6
    Alpaca-CoT

    We unified the interfaces of instruction-tuning data

    ...The repository includes datasets, training scripts, and examples demonstrating how chain-of-thought data can be used to fine-tune language models. It also explores how reasoning traces generated by larger models can be distilled into smaller models.
    Downloads: 0 This Week
  • 7
    Adala

    Adala: Autonomous DAta (Labeling) Agent framework

    Adala is a data-centric AI framework focused on dataset curation, annotation, and validation. It helps AI teams manage high-quality training datasets by providing tools for data auditing, error detection, and quality assessment.
    Downloads: 0 This Week
  • 8
    CLIP-as-service

    Embed images and sentences into fixed-length vectors

    ...It can be easily integrated as a microservice into neural search solutions. Serve CLIP models with TensorRT, ONNX Runtime, and PyTorch without JIT at 800 QPS. Non-blocking duplex streaming on requests and responses, designed for large data and long-running tasks. Horizontally scale multiple CLIP models up and down on a single GPU, with automatic load balancing. Easy to use: no learning curve, with a minimalist design on client and server. Intuitive and consistent API for image and sentence embedding. Async client support. Easily switch between gRPC, HTTP, and WebSocket protocols with TLS and compression. ...
    Downloads: 1 This Week
  • 9
    RAGs

    Build ChatGPT over your data, all with natural language

    ...Users can also inspect and adjust parameters such as the number of retrieved documents, summarization strategies, and query settings through a configuration interface. Once the pipeline is created, the system enables conversational queries over the connected data sources, effectively creating a personalized knowledge assistant.
    Downloads: 0 This Week
  • 10
    solo-learn

    Library of self-supervised methods for visual representation

    A library of self-supervised methods for unsupervised visual representation learning, powered by PyTorch Lightning. We aim to provide SOTA self-supervised methods in a comparable environment while, at the same time, implementing training tricks. The library is self-contained, but it is possible to use the models outside of solo-learn.
    Downloads: 2 This Week
  • 11
    YiVal

    Your Automatic Prompt Engineering Assistant for GenAI Applications

    ...It focuses on experimentation and optimization by allowing users to test multiple prompt variations, configurations, and model parameters in parallel, then evaluate their outputs using structured metrics and scoring systems. The platform is particularly useful in production environments where prompt quality directly impacts user experience, as it provides a repeatable and data-driven approach to refining prompts rather than relying on manual trial and error. YiVal supports integration with various LLM providers and can orchestrate experiments across different models, making it adaptable to evolving AI ecosystems. It also includes evaluation pipelines that help quantify output quality based on criteria such as accuracy, coherence, or task-specific benchmarks.
    Downloads: 2 This Week
  • 12
    DPM-Solver

    Fast ODE Solver for Diffusion Probabilistic Model Sampling

    DPM-Solver is a machine learning research implementation focused on accelerating the sampling process in diffusion probabilistic models used for generative AI tasks. Diffusion models are powerful generative systems capable of producing high-quality images and other data, but traditional sampling methods often require hundreds or thousands of computational steps. The project introduces a specialized numerical solver designed to approximate the diffusion process using a small number of high-order integration steps. By reformulating the sampling problem as the solution of a diffusion-related ordinary differential equation, the solver can produce high-quality samples much more efficiently. ...
    Downloads: 0 This Week
  • 13
    OpenAssistant

    Chat-based assistant that understands tasks

    ...You do not need to run the project locally unless you are contributing to the development process. The website link above will take you to the public website where you can use the data collection app.
    Downloads: 2 This Week
  • 14
    SageMaker Inference Toolkit

    Serve machine learning models within a Docker container

    Serve machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. Once you have a trained model, you can include it in a Docker container that runs your inference code. A container provides an effectively isolated environment, ensuring a consistent runtime regardless of where the container is deployed. ...
    Downloads: 0 This Week
  • 15
    TextGen

    textgen, Text Generation models

    Implementation of text generation models. textgen implements a variety of text generation models out of the box, including UDA, GPT2, Seq2Seq, BART, T5, SongNet and others. UDA: non-core word replacement. EDA: simple data augmentation techniques such as similar-word and synonym replacement and random word insertion, deletion, and replacement. The project follows Google's UDA (non-core word replacement) algorithm and the EDA algorithm: based on TF-IDF, it replaces unimportant words in a sentence with synonyms and applies random word insertion, deletion, and replacement to generate new text for augmentation. It also implements back translation based on the Baidu translation API, first translating Chinese sentences into English and then translating the English into new Chinese. ...
    Downloads: 2 This Week
  • 16
    Autolabel

    Label, clean and enrich text datasets with LLMs

    Autolabel is a Python library to label, clean and enrich datasets with Large Language Models (LLMs). Autolabel data for NLP tasks such as classification, question answering, named entity recognition, entity matching and more. Seamlessly use commercial and open-source LLMs from providers such as OpenAI, Anthropic, HuggingFace, Google and more.
    Downloads: 1 This Week
  • 17
    Chinese Llama 2 7B

    The first Chinese LLaMA2 model in the open source community

    Chinese Llama 2 7B is an open-source large language model adapted from the LLaMA-2 architecture and optimized for Chinese and bilingual Chinese-English applications. The project provides a version of LLaMA-2 that has been further trained on Chinese data so it can better understand and generate text in Chinese while maintaining compatibility with the original model ecosystem. In addition to the model weights, the repository also includes supervised fine-tuning datasets and training resources that help developers build chat-optimized versions of the model. The project follows the input format used by the LLaMA-2 chat architecture, ensuring compatibility with existing optimization techniques and tools built for the LLaMA-2 ecosystem. ...
    Downloads: 0 This Week
  • 18
    Asteroid

    The PyTorch-based audio source separation toolkit for researchers

    ...Extending the toolkit with new features is simple. Add a new filterbank, separator architecture, dataset or even recipe very easily. Recipes provide an easy way to reproduce results with data preparation, system design, training and evaluation in a single script. This is an essential tool for the community! The default logger is TensorBoard in all the recipes. From the recipe folder, you can run the following to visualize the logs of all your runs.
    Downloads: 0 This Week
  • 19
    State of Open Source AI

    Clarity in the current fast-paced mess of Open Source innovation

    This repository is the source for a book (or large written work) titled “The State of Open Source AI”. The goal of the project is to bring clarity to the rapidly evolving open-source AI ecosystem by documenting trends, models, tools, standards, deployment practices, and challenges. It acts as both a snapshot and a guide: readers can see what’s “hot now” in open AI infrastructure, what open licensing or governance issues are emerging, how deployment options compare, and what gaps remain. ...
    Downloads: 0 This Week
  • 20
    fastquant

    Backtest and optimize your ML trading strategies with only 3 lines

    ...The project focuses on making backtesting accessible by providing a high-level interface that allows users to test investment strategies with only a few lines of code. It integrates historical market data sources and trading frameworks so that users can quickly build experiments without constructing complex data pipelines. The framework enables users to test common strategies such as moving average crossovers, momentum trading, and custom indicators on historical stock data. By automating data retrieval, strategy evaluation, and result visualization, the library reduces the barrier to entry for individuals interested in quantitative finance. ...
    Downloads: 0 This Week
  • 21
    ChatFred

    Alfred workflow using ChatGPT, DALL·E 2 and other models for chatting

    ... ⤓ Install from the Alfred Gallery or download it from GitHub, then add your OpenAI API key. If you have used ChatGPT or DALL·E 2, you already have an OpenAI account. Otherwise, you can sign up; you will receive $5 in free credit, and no payment data is required. Afterward, you can create your API key. To start a conversation with ChatGPT, either use the keyword cf, set up the workflow as a fallback search in Alfred, or create a custom hotkey to send the clipboard content directly to ChatGPT.
    Downloads: 0 This Week
  • 22
    PyTorch Implementation of SDE Solvers

    Differentiable SDE solvers with GPU support and efficient sensitivity

    This library provides stochastic differential equation (SDE) solvers with GPU support and efficient backpropagation. examples/demo.ipynb gives a short guide on how to solve SDEs, including subtle points such as fixing the randomness in the solver and the choice of noise types. examples/latent_sde.py learns a latent stochastic differential equation, as in Section 5 of [1]. The example fits an SDE to data, whilst regularizing it to be like an Ornstein-Uhlenbeck prior process. The model can be loosely viewed as a variational autoencoder with its prior and approximate posterior being SDEs. The program outputs figures to the path specified by <TRAIN_DIR>. Training should stabilize after 500 iterations with the default hyperparameters. examples/sde_gan.py learns an SDE as a GAN, as in [2], [3]. ...
    Downloads: 0 This Week
  • 23
    xTuring

    Easily build, customize and control your own LLMs

    xTuring is open-source AI personalization software. It makes it easy to build, customize and control LLMs by providing a simple interface for personalizing them to your own data and application. xTuring provides fast, efficient and simple fine-tuning of LLMs such as LLaMA, GPT-J, Galactica, and more. The entire process can be done on your own computer or in your private cloud, ensuring data privacy and security.
    Downloads: 0 This Week
  • 24
    Lightning-Hydra-Template

    PyTorch Lightning + Hydra. A very user-friendly template

    ...A collection of best practices for efficient workflow and reproducibility. Thoroughly commented: you can use this repo as a reference and educational resource. Not suited for data engineering: the template configuration setup is not designed for building data processing pipelines that depend on each other. PyTorch Lightning is a lightweight PyTorch wrapper for high-performance AI research; think of it as a framework for organizing your PyTorch code. Hydra is a framework for elegantly configuring complex applications. ...
    Downloads: 0 This Week
  • 25
    AnyTrading

    The most simple, flexible, and comprehensive OpenAI Gym trading

    gym-anytrading is an OpenAI Gym-compatible environment designed for developing and testing reinforcement learning algorithms on trading strategies. It simulates trading environments for financial markets, including stocks and forex.
    Downloads: 2 This Week