Showing 224 open source projects for "git:/git.code.sf.net/p/docfetcher/code"

View related business solutions
  • Rev Your Digital Product Delivery Engine Icon
    Rev Your Digital Product Delivery Engine

    Enterprise-grade platform designed to connect strategy, planning, and execution across digital product development and software delivery

    Planview links your technology vision directly to teams' daily work, providing complete visibility and control over your digital product delivery ecosystem.
    Learn More
  • All-in-One Inspection Software Icon
    All-in-One Inspection Software

    flowdit is a connected worker platform tailored for industry needs in commissioning, quality, maintenance, and EHS management.

    Optimize Frontline Operations: Elevate Equipment Uptime, Operational Excellence, and Safety with Connected Teams and Data, Including Issue Capture and Corrective Action.
    Learn More
  • 1
    Watermark-Removal

    Watermark-Removal

    Machine learning image inpainting task that removes watermarks

    ...Through these techniques, the model learns to identify regions of the image affected by the watermark and generate realistic replacements for the missing visual information. The repository contains code for preprocessing images, training the model, and running inference on images to automatically remove watermark artifacts.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 2
    DocTR

    DocTR

    Library for OCR-related tasks powered by Deep Learning

    ...Seemlessly process documents for Natural Language Understanding tasks: we provide OCR predictors to parse textual information (localize and identify each word) from your documents. Robust 2-stage (detection + recognition) OCR predictors with pretrained parameters. User-friendly, 3 lines of code to load a document and extract text with a predictor. State-of-the-art performances on public document datasets, comparable with GoogleVision/AWS Textract. Easy integration (available templates for browser demo & API deployment). End-to-End OCR is achieved in docTR using a two-stage approach: text detection (localizing words), then text recognition (identify all characters in the word). ...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 3
    Shapash

    Shapash

    Explainability and Interpretability to Develop Reliable ML models

    Shapash is a Python library dedicated to the interpretability of Data Science models. It provides several types of visualization that display explicit labels that everyone can understand. Data Scientists can more easily understand their models, share their results and easily document their projects in an HTML report. End users can understand the suggestion proposed by a model using a summary of the most influential criteria.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    RL with PyTorch

    RL with PyTorch

    Clean, Robust, and Unified PyTorch implementation

    ...The project focuses on helping developers and researchers understand reinforcement learning methods by providing clean and reproducible implementations of well-known algorithms. It includes code for popular deep reinforcement learning techniques such as Deep Q-Networks, policy gradient methods, actor-critic architectures, and other modern RL approaches. The repository is structured so that users can easily experiment with different algorithms and training environments. Many examples demonstrate how agents learn to interact with simulated environments through trial and error using reinforcement learning principles. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • The #1 AI-Powered eLearning Platform Icon
    The #1 AI-Powered eLearning Platform

    For users seeking a platform to generate online courses using AI

    Transform your content into engaging eLearning experiences with Coursebox, the #1 AI-powered eLearning authoring tool. Our platform automates the course creation process, allowing you to design a structured course in seconds. Simply make edits, add any missing elements, and your course is ready to go. Whether you want to publish privately, share publicly, sell your course, or export it to your LMS, Coursebox has you covered.
    Learn More
  • 5
    Data Science Articles from CodeCut

    Data Science Articles from CodeCut

    Collection of useful data science topics along with articles

    ...The materials address areas such as MLOps, data management, project organization, testing practices, visualization techniques, and productivity tools used by data scientists. Each topic often includes references to code repositories, demonstrations, and video tutorials that show how the tools can be applied in real projects. The repository is intended to help practitioners stay updated with current best practices and technologies in the field of data science.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Optax

    Optax

    Optax is a gradient processing and optimization library for JAX

    ...We favor focusing on small composable building blocks that can be effectively combined into custom solutions. Others may build upon these basic components in more complicated abstractions. Whenever reasonable, implementations prioritize readability and structuring code to match standard equations, over code reuse.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Stable Baselines3

    Stable Baselines3

    PyTorch version of Stable Baselines

    Stable Baselines3 (SB3) is a set of reliable implementations of reinforcement learning algorithms in PyTorch. It is the next major version of Stable Baselines. You can read a detailed presentation of Stable Baselines3 in the v1.0 blog post or our JMLR paper. These algorithms will make it easier for the research community and industry to replicate, refine, and identify new ideas, and will create good baselines to build projects on top of. We expect these tools will be used as a base around...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 8
    Gradio

    Gradio

    Create UIs for your machine learning model in Python in 3 minutes

    Gradio is the fastest way to demo your machine learning model with a friendly web interface so that anyone can use it, anywhere! Gradio can be installed with pip. Creating a Gradio interface only requires adding a couple lines of code to your project. You can choose from a variety of interface types to interface your function. Gradio can be embedded in Python notebooks or presented as a webpage. A Gradio interface can automatically generate a public link you can share with colleagues that lets them interact with the model on your computer remotely from their own devices. ...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 9
    AI-Tutorials/Implementations Notebooks

    AI-Tutorials/Implementations Notebooks

    Codes/Notebooks for AI Projects

    AI-Tutorials/Implementations Notebooks repository is a comprehensive collection of artificial intelligence tutorials and implementation examples intended for developers, students, and researchers who want to learn by building practical AI projects. The repository contains numerous Jupyter notebooks and code samples that demonstrate modern techniques in machine learning, deep learning, data science, and large language model workflows. It includes implementations for a wide range of AI topics such as computer vision, agent systems, federated learning, distributed systems, adversarial attacks, and generative AI. Many of the tutorials focus on building AI agents, multi-agent systems, and workflows that integrate language models with external tools or APIs. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Self-hosted password manager Icon
    Self-hosted password manager

    Developed and headquartered in Europe (Barcelona, Spain), Passwork meets GDPR, NIS2, ENS and other European regulatory requirements by design.

    On-premise solution with double encryption and certified development processes for maximum protection of corporate data. Zero‑knowledge architecture ensures your passwords never leave your infrastructure.
    Learn More
  • 10
    python-small-examples

    python-small-examples

    Focus on creating classic Python small examples and cases

    python-small-examples is an open-source educational repository that contains hundreds of concise Python programming examples designed to illustrate practical coding techniques. The project focuses on teaching programming concepts through small, focused scripts that demonstrate common tasks in data processing, visualization, and general programming. Each example highlights a specific function or programming pattern so that learners can quickly understand how to apply Python features in...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 11
    DeepSeed

    DeepSeed

    Deep learning optimization library making distributed training easy

    DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. DeepSpeed delivers extreme-scale model training for everyone, from data scientists training on massive supercomputers to those training on low-end clusters or even on a single GPU. Using current generation of GPU clusters with hundreds of devices, 3D parallelism of DeepSpeed can efficiently train deep learning models with trillions of parameters. With just a single GPU,...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 12
    Materials Discovery: GNoME

    Materials Discovery: GNoME

    AI discovers 520000 stable inorganic crystal structures for research

    Materials Discovery (GNoME) is a large-scale research initiative by Google DeepMind focused on applying graph neural networks to accelerate the discovery of stable inorganic crystal materials. The project centers on Graph Networks for Materials Exploration (GNoME), a message-passing neural network architecture trained on density functional theory (DFT) data to predict material stability and energy formation. Using GNoME, DeepMind identified 381,000 new stable materials, later expanding the...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 13
    SageMaker Training Toolkit

    SageMaker Training Toolkit

    Train machine learning models within Docker containers

    ...You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. To train a model, you can include your training script and dependencies in a Docker container that runs your training code. A container provides an effectively isolated environment, ensuring a consistent runtime and reliable training process. The SageMaker Training Toolkit can be easily added to any Docker container, making it compatible with SageMaker for training models. If you use a prebuilt SageMaker Docker image for training, this library may already be included. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 14
    AutoViz

    AutoViz

    Automatically Visualize any dataset, any size

    AutoViz is a Python data visualization library designed to automate exploratory data analysis by generating multiple visualizations with minimal code. The primary goal of the project is to help data scientists and analysts quickly understand patterns, relationships, and anomalies within datasets without manually writing complex plotting code. With a single command, the library can automatically generate dozens of charts and graphs that reveal insights into the structure and quality of the data. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    machine learning tutorials

    machine learning tutorials

    machine learning tutorials (mainly in Python3)

    machine-learning is a continuously updated repository documenting the author’s learning journey through data science and machine learning topics using practical tutorials and experiments. The project presents educational notebooks that combine mathematical explanations with code implementations using Python’s scientific computing ecosystem. Topics covered include classical machine learning algorithms, deep learning models, reinforcement learning, model deployment, and time-series analysis. The repository integrates numerous popular machine learning frameworks and libraries such as scikit-learn, PyTorch, TensorFlow, XGBoost, and Hugging Face. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Kaggle Solutions

    Kaggle Solutions

    Collection of Kaggle Solutions and Ideas

    Kaggle Solutions is an open-source repository that compiles winning solutions, insights, and educational resources from hundreds of Kaggle data science competitions. The repository acts as a knowledge base for competitive machine learning by collecting solution write-ups, discussion threads, code notebooks, and tutorial resources shared by top Kaggle participants. Each competition entry typically includes information about the dataset, evaluation metrics, modeling strategies, and techniques used by high-ranking competitors. The repository also highlights important machine learning concepts such as feature engineering, cross-validation strategies, ensemble modeling, and post-processing methods commonly used in winning solutions. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    IVY

    IVY

    The Unified Machine Learning Framework

    Take any code that you'd like to include. For example, an existing TensorFlow model, and some useful functions from both PyTorch and NumPy libraries. Choose any framework for writing your higher-level pipeline, including data loading, distributed training, analytics, logging, visualization etc. Choose any backend framework which should be used under the hood, for running this entire pipeline.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    TorchCode

    TorchCode

    Practice implementing softmax, attention, GPT-2 and more

    ...The platform provides a collection of curated problems that cover fundamental topics such as activation functions, normalization layers, attention mechanisms, and full transformer architectures. It runs in a Jupyter-based environment, allowing users to write, test, and debug their code interactively while receiving immediate feedback. An automated judging system evaluates correctness, gradient flow, and numerical stability, helping users understand both functional and theoretical aspects of their implementations.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    Semantic Router

    Semantic Router

    Superfast AI decision making and processing of multi-modal data

    Semantic Router is a superfast decision-making layer for your LLMs and agents. Rather than waiting for slow, unreliable LLM generations to make tool-use or safety decisions, we use the magic of semantic vector space — routing our requests using semantic meaning. Combining LLMs with deterministic rules means we can be confident that our AI systems behave as intended. Cramming agent tools into the limited context window is expensive, slow, and fundamentally limited. Semantic Router enables...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 20
    segment-geospatial

    segment-geospatial

    A Python package for segmenting geospatial data with the SAM

    ...My primary objective is to simplify the process of leveraging SAM for geospatial data analysis by enabling users to achieve this with minimal coding effort. I have adapted the source code of segment-geospatial from the segment-anything-eo repository, and credit for its original version goes to Aliaksandr Hancharenka.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 21
    OpenCLIP

    OpenCLIP

    An open source implementation of CLIP

    ...Specifically, a ResNet-50 model trained with our codebase on OpenAI's 15 million image subset of YFCC achieves 32.7% top-1 accuracy on ImageNet. OpenAI's CLIP model reaches 31.3% when trained on the same subset of YFCC. For ease of experimentation, we also provide code for training on the 3 million images in the Conceptual Captions dataset, where a ResNet-50x4 trained with our codebase reaches 22.2% top-1 ImageNet accuracy. This codebase is work in progress, and we invite all to contribute in making it more accessible and useful. In the future, we plan to add support for TPU training and release larger models. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 22
    AutoGluon

    AutoGluon

    AutoGluon: AutoML for Image, Text, and Tabular Data

    ...Intended for both ML beginners and experts, AutoGluon enables you to quickly prototype deep learning and classical ML solutions for your raw data with a few lines of code. Automatically utilize state-of-the-art techniques (where appropriate) without expert knowledge. Leverage automatic hyperparameter tuning, model selection/ensembling, architecture search, and data processing. Easily improve/tune your bespoke models and data pipelines, or customize AutoGluon for your use-case. AutoGluon is modularized into sub-modules specialized for tabular, text, or image data. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 23
    Petastorm

    Petastorm

    Petastorm library enables single machine or distributed training

    ...It can also be used from pure Python code. A dataset created using Petastorm is stored in Apache Parquet format. On top of a Parquet schema, petastorm also stores higher-level schema information that makes multidimensional arrays into a native part of a petastorm dataset. Petastorm supports extensible data codecs. These enable a user to use one of the standard data compressions (jpeg, png) or implement her own.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Datasets

    Datasets

    Hub of ready-to-use datasets for ML models

    Datasets is a library for easily accessing and sharing datasets, and evaluation metrics for Natural Language Processing (NLP), computer vision, and audio tasks. Load a dataset in a single line of code, and use our powerful data processing methods to quickly get your dataset ready for training in a deep learning model. Backed by the Apache Arrow format, process large datasets with zero-copy reads without any memory constraints for optimal speed and efficiency. We also feature a deep integration with the Hugging Face Hub, allowing you to easily load and share a dataset with the wider NLP community. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 25
    Hamilton DAGWorks

    Hamilton DAGWorks

    Helps scientists define testable, modular, self-documenting dataflow

    ...Your DAG is expressive; Hamilton has extensive features to define and modify the execution of a DAG (e.g., data validation, experiment tracking, remote execution). To create a DAG, write regular Python functions that specify their dependencies with their parameters. As shown below, it results in readable code that can always be visualized. Hamilton loads that definition and automatically builds the DAG for you. Hamilton brings modularity and structure to any Python application moving data: ETL pipelines, ML workflows, LLM applications, RAG systems, BI dashboards, and the Hamilton UI allows you to automatically visualize, catalog, and monitor execution.
    Downloads: 1 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB