Python ETL framework for stream processing, real-time analytics, LLM
Benchmarking synthetic data generation methods
Training data (data labeling, annotation, workflow) for all data types
Python module that helps you build complex pipelines of batch jobs
An open source multi-tool for exploring and publishing data
Always know what to expect from your data
The open standard for data logging
matplotlib: plotting with Python
AI-data warehouse to enrich, transform and analyze unstructured data
Open-source data observability for analytics engineers
Build, run, and manage data pipelines for integrating data
airda (Air Data Agent)
Parallel computing with task scheduling
Repository for the Astropy core package
AutoGluon: AutoML for Image, Text, and Tabular Data
WebGL-based viewer for volumetric data
The toolkit to test, validate, and evaluate your models and surface
A cross-platform installer for the Julia programming language
Detecting silent model failure. NannyML estimates performance
Data science on data without acquiring a copy
The open-source tool for building high-quality datasets
Monitor the stability of a Pandas or Spark dataframe
Visualize and compare datasets, target values and associations
Recap tracks and transforms schemas across your whole application
Build beautiful web-based analytic apps, no JavaScript required