Page 2 | sandbox:/mnt/data/project_plan.pod free download

Evidently

Evaluate and monitor ML models from validation to production

Evidently is an open-source Python library for data scientists and ML engineers. It helps evaluate, test, and monitor ML models from validation to production. It works with tabular data, text data, and embeddings.

Downloads: 5 This Week

Last Update: 2026-03-10

Zero to Mastery Machine Learning

All course materials for the Zero to Mastery Machine Learning

Zero to Mastery Machine Learning is an open-source repository that contains the complete course materials for the Zero to Mastery Machine Learning and Data Science bootcamp. The project provides a structured curriculum designed to teach machine learning and data science using Python through hands-on projects and interactive notebooks. The repository includes datasets, Jupyter notebooks, documentation, and example code that walk learners through the entire machine learning workflow from problem definition to model deployment. ...

Downloads: 7 This Week

Last Update: 2026-03-11

Bytewax

Python Stream Processing

...You can use Bytewax for a variety of workloads, from moving data à la Kafka Connect all the way to advanced online machine learning workloads. Bytewax is not limited to streaming applications but excels anywhere that data can be distributed at the input and output.

Downloads: 2 This Week

Last Update: 2024-11-25

PySyft

Data science on data without acquiring a copy

...Wherever your data wants to live in your ownership, the Syft ecosystem exists to help keep it there while allowing it to be used privately.

Downloads: 1 This Week

Last Update: 2025-02-13

FiftyOne

The open-source tool for building high-quality datasets

...FiftyOne provides the building blocks for optimizing your dataset analysis pipeline. Use it to get hands-on with your data, including visualizing complex labels, evaluating your models, exploring scenarios of interest, identifying failure modes, finding annotation mistakes, and much more! Surveys show that machine learning engineers spend over half of their time wrangling data, but it doesn't have to be that way.

Downloads: 1 This Week

Last Update: 2026-04-06

DALI

A GPU-accelerated library containing highly optimized building blocks

The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provides a collection of highly optimized building blocks for loading and processing image, video and audio data. It can be used as a portable drop-in replacement for built-in data loaders and data iterators in popular deep learning frameworks.

Downloads: 1 This Week

Last Update: 2026-02-19

Awesome Fraud Detection Research Papers

A curated list of data mining papers about fraud detection

A curated list of data mining papers about fraud detection from several conferences.

Downloads: 0 This Week

Last Update: 2026-01-05

RAGFlow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine

...It offers a streamlined RAG workflow for businesses of any scale, combining LLMs (large language models) to provide truthful question-answering capabilities, backed by well-founded citations from various complex formatted data.

Downloads: 3 This Week

Last Update: 2026-02-10

Spice.ai OSS

A self-hostable CDN for databases

Spice is a portable runtime offering developers a unified SQL interface to materialize, accelerate, and query data from any database, data warehouse, or data lake. Spice connects, fuses, and delivers data to applications, machine-learning models, and AI backends, functioning as an application-specific, tier-optimized Database CDN. The Spice runtime, written in Rust, is built with industry-leading technologies such as Apache DataFusion, Apache Arrow, Apache Arrow Flight, SQLite, and DuckDB. ...

Downloads: 3 This Week

Last Update: 2026-04-10

Pandas Profiling

Create HTML profiling reports from pandas DataFrame objects

pandas-profiling generates profile reports from a pandas DataFrame. The pandas df.describe() function is handy yet a little basic for exploratory data analysis. pandas-profiling extends pandas DataFrame with df.profile_report(), which automatically generates a standardized univariate and multivariate report for data understanding. The report includes high-correlation warnings based on different correlation metrics (Spearman, Pearson, Kendall, Cramér’s V, Phik), as well as the most common categories (uppercase, lowercase, separator), scripts (Latin, Cyrillic), and blocks (ASCII, Cyrillic). ...

Downloads: 0 This Week

Last Update: 4 days ago

dlib

Toolkit for making machine learning and data analysis applications

Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real-world problems. It is used in both industry and academia in a wide range of domains including robotics, embedded devices, mobile phones, and large high-performance computing environments. Dlib's open source licensing allows you to use it in any application, free of charge. Good unit test coverage, the ratio of unit test lines of code to library lines of code is...

Downloads: 4 This Week

Last Update: 2026-03-29

Python Programming Hub

Learn Python and Machine Learning from scratch

...The repository emphasizes hands-on learning by demonstrating real programming tasks such as data manipulation, statistical analysis, visualization, and automation. It also includes examples of commonly used libraries such as NumPy, Pandas, and other tools used in data science workflows.

Downloads: 1 This Week

Last Update: 2026-03-12

MNE-Python

Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python

MNE-Python is an open-source Python package for exploring, visualizing, and analyzing human neurophysiological data such as MEG, EEG, sEEG, ECoG, and more. It includes modules for data input/output, preprocessing, visualization, source estimation, time-frequency analysis, connectivity analysis, machine learning, statistics, and more.

Downloads: 2 This Week

Last Update: 7 hours ago

BetaML.jl

Beta Machine Learning Toolkit

The Beta Machine Learning Toolkit is a package including many algorithms and utilities to implement machine learning workflows in Julia, Python, R and any other language with a Julia binding. All models are implemented entirely in Julia and are hosted in the repository itself (i.e. they are not wrappers for third-party models). If your favorite option or model is missing, you can try to implement it yourself and open a pull request to share it (see the section Contribute below) or request its...

Downloads: 2 This Week

Last Update: 2025-10-30

C3

The goal of CLAIMED is to enable low-code/no-code rapid prototyping

C3 is an open-source framework designed to simplify the development and deployment of data science and machine learning workflows through reusable components and low-code development techniques. The framework focuses on enabling rapid prototyping while maintaining a path to production through automated CI/CD integration. CLAIMED provides a component-based architecture where data processing steps, models, and workflows can be packaged into reusable operators.

Downloads: 4 This Week

Last Update: 2026-04-13

Open Notebook

An Open Source implementation of Notebook LM with more flexibility

Open Notebook is an open-source, privacy-focused alternative to Google’s Notebook LM that gives users full control over their research and AI workflows. Designed to be self-hosted, it ensures complete data sovereignty by keeping your content local or within your own infrastructure. The platform supports 16+ AI providers—including OpenAI, Anthropic, Ollama, Google, and LM Studio—allowing flexible model choice and cost optimization. Open Notebook enables users to organize and analyze multi-modal content such as PDFs, videos, audio files, web pages, and Office documents. ...

Downloads: 33 This Week

Last Update: 1 day ago

DataDrivenDiffEq.jl

Data driven modeling and automated discovery of dynamical systems

DataDrivenDiffEq.jl is a package for finding systems of equations automatically from a dataset. The methods in this package take in data and return the model which generated the data. A known model is not required as input. These methods can estimate equation-free and equation-based models for discrete, continuous differential equations or direct mappings.

Downloads: 2 This Week

Last Update: 2026-01-08

mosaicml composer

Supercharge Your Model Training

composer is a deep learning training framework built on PyTorch and designed to make large-scale model training more efficient, scalable, and customizable. At the center of the project is a highly optimized Trainer abstraction that simplifies the management of training loops, parallelization, metrics, logging, and data loading. The framework is intended for modern workloads that may span anything from a single GPU to very large distributed training environments, which makes it suitable for both experimentation and production-scale development. It includes built-in support for distributed training strategies such as Fully Sharded Data Parallelism and standard Distributed Data Parallel execution, helping teams scale models without having to assemble as much infrastructure by hand.

Downloads: 1 This Week

Last Update: 2026-03-10

AutoViz

Automatically Visualize any dataset, any size

AutoViz is a Python data visualization library designed to automate exploratory data analysis by generating multiple visualizations with minimal code. The primary goal of the project is to help data scientists and analysts quickly understand patterns, relationships, and anomalies within datasets without manually writing complex plotting code. With a single command, the library can automatically generate dozens of charts and graphs that reveal insights into the structure and quality of the data.

Downloads: 0 This Week

Last Update: 2026-03-12

TorchRL

A modular, primitive-first, Python-first PyTorch library

TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch. TorchRL provides PyTorch-first and Python-first, low- and high-level abstractions for RL that are intended to be efficient, modular, documented, and properly tested. The code is aimed at supporting research in RL. Most of it is written in Python in a highly modular way, such that researchers can easily swap components, transform them, or write new ones with little effort.

Downloads: 61 This Week

Last Update: 2026-02-05

Lance

Modern columnar data format for ML and LLMs implemented in Rust

Lance is a columnar data format that is easy and fast to version, query, and train on. It’s designed to be used with images, videos, 3D point clouds, audio, and of course tabular data. It supports any POSIX file system and cloud storage such as AWS S3 and Google Cloud Storage.

Downloads: 1 This Week

Last Update: 2026-03-30

Apache Hamilton

Helps data scientists define testable self-documenting dataflows

...This approach encourages modular, testable, and maintainable data pipelines because each transformation is isolated and easily unit tested. The framework also automatically tracks lineage and metadata about how data is produced, which improves debugging, reproducibility, and transparency in data workflows.

Downloads: 0 This Week

Last Update: 2026-03-12

SageMaker Training Toolkit

Train machine learning models within Docker containers

Train machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. To train a model, you can include your training script and dependencies in a Docker container that runs your training code. A container provides an effectively isolated environment, ensuring a consistent runtime and reliable training process. ...

Downloads: 1 This Week

Last Update: 2025-09-22

DataFrame

C++ DataFrame for statistical, financial, and ML analysis

This is a C++ analytical library designed for data analysis similar to libraries in Python and R. For example, you would compare this to Pandas, R data.frame, or Polars. You can slice the data in many different ways. You can join, merge, and group-by the data. You can run various statistical, summarization, financial, and ML algorithms on the data. You can add your custom algorithms easily.

Downloads: 1 This Week

Last Update: 2026-04-03

MLJAR Studio

Python package for AutoML on Tabular Data with Feature Engineering

...All running locally on your machine. We are waiting for your feedback. The mljar-supervised is an Automated Machine Learning Python package that works with tabular data. It is designed to save time for a data scientist. It abstracts the common way to preprocess the data, construct the machine learning models, and perform hyper-parameter tuning to find the best model. It is not a black box, as you can see exactly how the ML pipeline is constructed (with a detailed Markdown report for each ML model).

Downloads: 2 This Week

Last Update: 2026-03-26

Search Results for "sandbox:/mnt/data/project_plan.pod" - Page 2

Showing 533 open source projects for "sandbox:/mnt/data/project_plan.pod"
