Showing 62 open source projects for "k-means clustering"

  • 1
    Clustering.jl

    A Julia package for data clustering

    Methods for data clustering and evaluation of clustering quality.
    Downloads: 4 This Week
  • 2
    MatlabMachine

    Machine learning algorithms

    Matlab-Machine is a comprehensive collection of machine learning algorithms implemented in MATLAB. It includes both basic and advanced techniques for classification, regression, clustering, and dimensionality reduction. Designed for educational and research purposes, the repository provides clear implementations that help users understand core ML concepts.
    Downloads: 2 This Week
  • 3
    Machine Learning Octave

    MATLAB/Octave examples of popular machine learning algorithms

    This repository contains MATLAB / Octave implementations of popular machine learning algorithms, along with explanatory code and mathematical derivations, intended as educational material rather than production code. It includes implementations of supervised learning algorithms (linear regression, logistic regression, neural nets). The author’s goal is to help users understand how each algorithm works “from scratch,” avoiding black-box library calls. Code written so as to expose and comment on...
    Downloads: 0 This Week
  • 4
    HDBSCAN

    A high performance implementation of HDBSCAN clustering

    HDBSCAN - Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over varying epsilon values and integrates the result to find a clustering that gives the best stability over epsilon. This allows HDBSCAN to find clusters of varying densities (unlike DBSCAN), and be more robust to parameter selection. In practice this means that HDBSCAN returns a good clustering straight away with little or no parameter tuning -- and the primary parameter, minimum cluster size, is intuitive and easy to select. ...
    Downloads: 2 This Week
  • 5
    dlib

    Toolkit for making machine learning and data analysis applications

    Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real world problems. It is used in both industry and academia in a wide range of domains including robotics, embedded devices, mobile phones, and large high performance computing environments. Dlib's open source licensing allows you to use it in any application, free of charge. Good unit test coverage: the ratio of unit test lines of code to library lines of code is...
    Downloads: 7 This Week
  • 6
    HyperTools

    A Python toolbox for gaining geometric insights

    ...Functions for plotting high-dimensional datasets in 2/3D. Static and animated plots. Simple API for customizing plot styles. Set of powerful data manipulation tools including hyperalignment, k-means clustering, normalizing and more. Support for lists of Numpy arrays, Pandas dataframes, text or (mixed) lists. Applying topic models and other text vectorization methods to text data. HyperTools is designed to facilitate dimensionality reduction-based visual explorations of high-dimensional data. The basic pipeline is to feed in a high-dimensional dataset (or a series of high-dimensional datasets) and, in a single function call, reduce the dimensionality of the dataset(s) and create a plot.
    Downloads: 1 This Week
  • 7
    kube-state-metrics

    Add-on agent to generate and expose cluster-level metrics

    kube-state-metrics (KSM) is a simple service that listens to the Kubernetes API server and generates metrics about the state of the objects. (See examples in the Metrics section below.) It is not focused on the health of the individual Kubernetes components, but rather on the health of the various objects inside, such as deployments, nodes and pods. kube-state-metrics is about generating metrics from Kubernetes API objects without modification. This ensures that features provided by...
    Downloads: 8 This Week
  • 8
    Machine learning basics

    Plain Python implementations of basic machine learning algorithms

    ...Instead of relying on external machine learning libraries, the algorithms are implemented from scratch so that users can explore the mathematical logic and computational structure behind each technique. The repository includes notebooks that demonstrate classic algorithms such as linear regression, logistic regression, k-nearest neighbors, decision trees, support vector machines, and clustering techniques. Each notebook typically combines explanatory text, Python code, and visualizations to illustrate how the algorithm operates and how it can be applied to datasets.
    Downloads: 1 This Week
  • 9
    Homemade Machine Learning

    Python examples of popular machine learning algorithms

    ...Each algorithm is accompanied by mathematical explanations, visualizations (often via Jupyter notebooks), and interactive demos so you can tweak parameters, data, and observe outcomes in real time. The purpose is pedagogical: you’ll see linear regression, logistic regression, k-means clustering, neural nets, decision trees, etc., built in Python using fundamentals like NumPy and Matplotlib, not hidden behind API calls. It is well suited for learners who want to move beyond library usage to understand how algorithms operate internally—how cost functions, gradients, updates and predictions work.
    Downloads: 0 This Week
  • 10
    Machine learning algorithms

    Minimal and clean examples of machine learning algorithms

    Machine learning algorithms is an open-source repository that provides minimal and clean implementations of machine learning algorithms written primarily in Python. The project focuses on demonstrating how fundamental machine learning methods work internally by implementing them from scratch rather than relying on high-level libraries. This approach allows learners to study the mathematical and algorithmic details behind widely used models in a transparent and readable way. The repository...
    Downloads: 0 This Week
  • 11
    Promxy

    An aggregating proxy to enable HA Prometheus

    ...Promxy delivers this unified access endpoint without requiring any sidecars, custom builds, or other changes to your Prometheus infrastructure. Prometheus itself provides no real HA/clustering support. As such, the best practice is to run multiple (e.g., N) hosts with the same config. Similarly, Prometheus has no real built-in query federation, which means that you end up with N sources in Grafana, which (1) is confusing to Grafana users and (2) has no support for aggregation across the sources. Promxy enables an HA Prometheus setup by "merging" the data from the duplicate hosts (so if there is a gap in one, promxy will fill it with the other). ...
    Downloads: 5 This Week
  • 12
    SageMaker Spark

    A Spark library for Amazon SageMaker

    ...These pipelines interleave native Spark ML stages and stages that interact with SageMaker training and model hosting. With SageMaker Spark, you can train on Amazon SageMaker from Spark DataFrames using Amazon-provided ML algorithms like K-Means clustering or XGBoost, and make predictions on DataFrames against SageMaker endpoints hosting your trained models, and, if you have your own ML algorithms built into SageMaker compatible Docker containers, you can use SageMaker Spark to train and infer on DataFrames with your own algorithms -- all at Spark scale. SageMaker Spark depends on hadoop-aws-2.8.1. ...
    Downloads: 0 This Week
  • 13
    Armadillo

    Fast C++ library for linear algebra & scientific computing

    * Fast C++ library for linear algebra (matrix maths) and scientific computing * Easy to use functions and syntax, deliberately similar to Matlab / Octave * Uses template meta-programming techniques to increase efficiency * Provides user-friendly wrappers for OpenBLAS, Intel MKL, LAPACK, ATLAS, ARPACK, SuperLU and FFTW libraries * Useful for machine learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc. * Downloads:...
    Downloads: 2,740 This Week
  • 14
    m23

    Your Linux deployment tool!

    m23 is a free software distribution system (license: GPL), that installs (via network, starting with partitioning and formatting) and administrates (updates, adds / removes software, adds / removes scripts) clients with Debian, (X/K)Ubuntu and LinuxMint. It is used for deployment of Linux clients in schools, institutions and enterprises. The m23 server is controlled via a web interface. A new m23 client can be installed easily in only three steps. Group functions and mass installation...
    Downloads: 37 This Week
  • 15
    MooseFS

    Fault tolerant, POSIX-compliant, Net Distributed Storage / File System

    MooseFS (MFS) is a fault tolerant, highly performing, scaling-out, network distributed file system. It spreads data over several physical servers which are visible to the user as one resource. For standard file operations MooseFS mounted with FUSE acts as other Unix-alike file systems: * A hierarchical structure (directory tree) * Stores POSIX file attributes (permissions, last access and modification times) Supports special files (block and character devices, pipes and sockets) *...
    Downloads: 4 This Week
  • 16
    Parallel and Distributed Process System

    NOTICE OF CONSOLIDATION & PARTNERSHIP PENDING As of April 2026, the 20

    NOTICE OF CONSOLIDATION & PARTNERSHIP PENDING As of April 2026, the 20 pipelines of the QCAUS/PDPBioGen suites are undergoing consolidation for high-scale institutional research. Core 'Ford 2026' algorithms remain the proprietary IP of the Ford Peace and Justice Foundation. Academic users at partner institutions are currently performing validation; all other commercial inquiries must contact the author. Computational Neuroscience: Large-scale neural population dynamics, brain-inspired...
    Downloads: 9 This Week
  • 17
    MLPACK is a C++ machine learning library with emphasis on scalability, speed, and ease-of-use. Its aim is to make machine learning possible for novice users by means of a simple, consistent API, while simultaneously exploiting C++ language features to provide maximum performance and flexibility for expert users. * More info + downloads: https://mlpack.org * Git repo: https://github.com/mlpack/mlpack
    Downloads: 0 This Week
  • 18
    Texthero

    Text preprocessing, representation and visualization from zero to hero

    Texthero is a Python package to work with text data efficiently. It empowers NLP developers with a tool to quickly understand any text-based dataset and it provides a solid pipeline to clean and represent text data, from zero to hero.
    Downloads: 0 This Week
  • 19
    PyArmadillo

    linear algebra library for Python

    PyArmadillo - streamlined linear algebra library for Python, with emphasis on ease of use. Alternative to NumPy / SciPy. * Main page: https://pyarma.sourceforge.io * Documentation: https://pyarma.sourceforge.io/docs.html * Bug reports: https://pyarma.sourceforge.io/faq.html * Git repo: https://gitlab.com/jason-rumengan/pyarma
    Downloads: 0 This Week
  • 20
    DeepCluster

    Deep Clustering for Unsupervised Learning of Visual Features

    DeepCluster is a classic self-supervised clustering-based representation learning algorithm that iteratively groups image features and uses the cluster assignments as pseudo-labels to train the network. In each round, features produced by the network are clustered (e.g. k-means), and the cluster IDs become supervision targets in the next epoch, encouraging the model to refine its representation to better separate semantic groups.
    Downloads: 0 This Week
  • 21
    GSMLBook

    Recipes for basic machine learning algorithms using sklearn in jupyter

    ...Topics include linear, multilinear, polynomial, stepwise, lasso, ridge, and logistic regression; ROC curves and measures of binary classification; nonlinear regression (including an introduction to gradient descent); classification and regression trees; random forests; neural networks; probabilistic methods (KNN, naive Bayes, QDA, LDA); dimensionality reduction with PCA; support vector machines; and clustering with K-Means, hierarchical, and DBSCAN. Appendices provide a review of probability and linear algebra. While some mathematical foundation is provided, it is not essential for understanding the implementations. The target audience is advanced community college and university students.
    Downloads: 0 This Week
  • 22

    Phusion2

    The genome assembly pipeline based on read clustering

    Phusion2 is a pipeline for de novo genome assembly using NGS data. It is based upon a strategy called read clustering. Starting with kmer frequency analysis, this allows for a reasonable selection of the kmer sizes. K-tuples from raw reads are merged and sorted into a table so that multiple occurring kmer words shared by different reads can be linked. A relation matrix is used to record the shared kmer words among all the reads. Setting a minimum threshold of shared k-tuples, the whole set of reads can then be clustered into groups using kmer sharing information in the relation matrix. ...
    Downloads: 0 This Week
  • 23
    Oryx

    Lambda architecture on Apache Spark, Apache Kafka for real-time

    Oryx 2 is a realization of the lambda architecture built on Apache Spark and Apache Kafka, but with specialization for real-time large-scale machine learning. It is a framework for building applications but also includes packaged, end-to-end applications for collaborative filtering, classification, regression and clustering. The application is written in Java, using Apache Spark, Hadoop, Tomcat, Kafka, Zookeeper and more. Configuration uses a single Typesafe Config config file, wherein...
    Downloads: 0 This Week
  • 24
    EECluster

    Tool for energy-efficient resource management in HPC clusters

    EECluster is a software tool for managing the energy-efficient allocation of the cluster resources. EECluster uses a Hybrid Genetic Fuzzy System as the decision-making mechanism that elicits part of its rule base dependent on the cluster workload scenario, delivering good compliance with the administrator's preferences. In the latest version, we leverage a more sophisticated and exhaustive model that covers a wider range of environmental aspects and balances service quality and power...
    Downloads: 0 This Week
  • 25

    NNC

    Nuclear Norm Clustering

    We present Nuclear Norm Clustering (NNC), an algorithm that can be used in different fields as a promising alternative to the k-means clustering method, and that is less sensitive to outliers. The NNC algorithm requires users to provide a data matrix M and a desired number of clusters K. We employed simulated annealing techniques to choose an optimal L that minimizes NN(L).
    Downloads: 0 This Week