Search Results for "k-means clustering"

Showing 62 open source projects for "k-means clustering"

1

Clustering.jl

A Julia package for data clustering

Methods for data clustering and evaluation of clustering quality.

Downloads: 4 This Week

Last Update: 2025-01-06
See Project
2

MatlabMachine

Machine learning algorithms

Matlab-Machine is a comprehensive collection of machine learning algorithms implemented in MATLAB. It includes both basic and advanced techniques for classification, regression, clustering, and dimensionality reduction. Designed for educational and research purposes, the repository provides clear implementations that help users understand core ML concepts.

Downloads: 2 This Week

Last Update: 2025-07-24
See Project
3

Machine Learning Octave

MatLab/Octave examples of popular machine learning algorithms

This repository contains MATLAB / Octave implementations of popular machine learning algorithms, along with explanatory code and mathematical derivations, intended as educational material rather than production code. It includes implementations of supervised learning algorithms (linear regression, logistic regression, neural nets). The author’s goal is to help users understand how each algorithm works “from scratch,” avoiding black-box library calls. Code written so as to expose and comment on...

Downloads: 0 This Week

Last Update: 2025-11-23
See Project
4

HDBSCAN

A high performance implementation of HDBSCAN clustering

HDBSCAN - Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over varying epsilon values and integrates the result to find a clustering that gives the best stability over epsilon. This allows HDBSCAN to find clusters of varying densities (unlike DBSCAN), and be more robust to parameter selection. In practice this means that HDBSCAN returns a good clustering straight away with little or no parameter tuning -- and the primary parameter, minimum cluster size, is intuitive and easy to select. ...

Downloads: 2 This Week

Last Update: 2026-03-27
See Project
5

dlib

Toolkit for making machine learning and data analysis applications

Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real world problems. It is used in both industry and academia in a wide range of domains including robotics, embedded devices, mobile phones, and large high performance computing environments. Dlib's open source licensing allows you to use it in any application, free of charge. Good unit test coverage: the ratio of unit test lines of code to library lines of code is...

Downloads: 7 This Week

Last Update: 2026-03-29
See Project
6

HyperTools

A Python toolbox for gaining geometric insights

...Functions for plotting high-dimensional datasets in 2/3D. Static and animated plots. Simple API for customizing plot styles. Set of powerful data manipulation tools including hyperalignment, k-means clustering, normalizing and more. Support for lists of Numpy arrays, Pandas dataframes, text or (mixed) lists. Applying topic models and other text vectorization methods to text data. HyperTools is designed to facilitate dimensionality reduction-based visual explorations of high-dimensional data. The basic pipeline is to feed in a high-dimensional dataset (or a series of high-dimensional datasets) and, in a single function call, reduce the dimensionality of the dataset(s) and create a plot.

Downloads: 1 This Week

Last Update: 2026-01-29
See Project
7

kube-state-metrics

Add-on agent to generate and expose cluster-level metrics

kube-state-metrics (KSM) is a simple service that listens to the Kubernetes API server and generates metrics about the state of the objects. (See examples in the Metrics section below.) It is not focused on the health of the individual Kubernetes components, but rather on the health of the various objects inside, such as deployments, nodes and pods. kube-state-metrics is about generating metrics from Kubernetes API objects without modification. This ensures that features provided by...

Downloads: 8 This Week

Last Update: 2026-01-18
See Project
8

Machine learning basics

Plain python implementations of basic machine learning algorithms

...Instead of relying on external machine learning libraries, the algorithms are implemented from scratch so that users can explore the mathematical logic and computational structure behind each technique. The repository includes notebooks that demonstrate classic algorithms such as linear regression, logistic regression, k-nearest neighbors, decision trees, support vector machines, and clustering techniques. Each notebook typically combines explanatory text, Python code, and visualizations to illustrate how the algorithm operates and how it can be applied to datasets.

Downloads: 1 This Week

Last Update: 2026-03-11
See Project
9

Homemade Machine Learning

Python examples of popular machine learning algorithms

...Each algorithm is accompanied by mathematical explanations, visualizations (often via Jupyter notebooks), and interactive demos so you can tweak parameters, data, and observe outcomes in real time. The purpose is pedagogical: you’ll see linear regression, logistic regression, k-means clustering, neural nets, decision trees, etc., built in Python using fundamentals like NumPy and Matplotlib, not hidden behind API calls. It is well suited for learners who want to move beyond library usage to understand how algorithms operate internally—how cost functions, gradients, updates and predictions work.

Downloads: 0 This Week

Last Update: 2025-11-23
See Project
10

Machine learning algorithms

Minimal and clean examples of machine learning algorithms

Machine learning algorithms is an open-source repository that provides minimal and clean implementations of machine learning algorithms written primarily in Python. The project focuses on demonstrating how fundamental machine learning methods work internally by implementing them from scratch rather than relying on high-level libraries. This approach allows learners to study the mathematical and algorithmic details behind widely used models in a transparent and readable way. The repository...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
11

Promxy

An aggregating proxy to enable HA prometheus

...Promxy delivers this unified access endpoint without requiring any sidecars, custom builds, or other changes to your Prometheus infrastructure. Prometheus itself provides no real HA/clustering support. As such, the best practice is to run multiple (e.g., N) hosts with the same config. Similarly, Prometheus has no real built-in query federation, which means that you end up with N sources in Grafana, which (1) is confusing to Grafana users and (2) has no support for aggregation across the sources. Promxy enables an HA Prometheus setup by "merging" the data from the duplicate hosts (so if there is a gap in one, Promxy will fill it with the other). ...

Downloads: 5 This Week

Last Update: 2025-04-29
See Project
12

SageMaker Spark

A Spark library for Amazon SageMaker

...These pipelines interleave native Spark ML stages and stages that interact with SageMaker training and model hosting. With SageMaker Spark, you can train on Amazon SageMaker from Spark DataFrames using Amazon-provided ML algorithms like K-Means clustering or XGBoost, and make predictions on DataFrames against SageMaker endpoints hosting your trained models. If you have your own ML algorithms built into SageMaker-compatible Docker containers, you can also use SageMaker Spark to train and infer on DataFrames with your own algorithms, all at Spark scale. SageMaker Spark depends on hadoop-aws-2.8.1. ...

Downloads: 0 This Week

Last Update: 2024-02-22
See Project
13

Armadillo

fast C++ library for linear algebra & scientific computing

* Fast C++ library for linear algebra (matrix maths) and scientific computing
* Easy to use functions and syntax, deliberately similar to Matlab / Octave
* Uses template meta-programming techniques to increase efficiency
* Provides user-friendly wrappers for OpenBLAS, Intel MKL, LAPACK, ATLAS, ARPACK, SuperLU and FFTW libraries
* Useful for machine learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc.
* Downloads:...

Downloads: 2,740 This Week

Last Update: 2 days ago
See Project
14

m23

Your linux deployment tool!

m23 is a free software distribution system (license: GPL) that installs (via network, starting with partitioning and formatting) and administrates (updates, adds / removes software, adds / removes scripts) clients with Debian, (X/K)Ubuntu and LinuxMint. It is used for deployment of Linux clients in schools, institutions and enterprises. The m23 server is controlled via a web interface. A new m23 client can be installed easily in only three steps. Group functions and mass installation...

Downloads: 37 This Week

Last Update: 2026-02-05
See Project
15

MooseFS

Fault tolerant, POSIX-compliant, Net Distributed Storage / File System

MooseFS (MFS) is a fault tolerant, highly performing, scaling-out, network distributed file system. It spreads data over several physical servers which are visible to the user as one resource. For standard file operations MooseFS mounted with FUSE acts as other Unix-alike file systems:
* A hierarchical structure (directory tree)
* Stores POSIX file attributes (permissions, last access and modification times)
* Supports special files (block and character devices, pipes and sockets)
*...

7 Reviews

Downloads: 4 This Week

Last Update: 2026-03-18
See Project
16

Parallel and Distributed Process System

NOTICE OF CONSOLIDATION & PARTNERSHIP PENDING As of April 2026, the 20

NOTICE OF CONSOLIDATION & PARTNERSHIP PENDING As of April 2026, the 20 pipelines of the QCAUS/PDPBioGen suites are undergoing consolidation for high-scale institutional research. Core 'Ford 2026' algorithms remain the proprietary IP of the Ford Peace and Justice Foundation. Academic users at partner institutions are currently performing validation; all other commercial inquiries must contact the author. Computational Neuroscience: Large-scale neural population dynamics, brain-inspired...

Downloads: 9 This Week

Last Update: 4 days ago
See Project
17

MLPACK C++ machine learning library

MLPACK is a C++ machine learning library with emphasis on scalability, speed, and ease-of-use. Its aim is to make machine learning possible for novice users by means of a simple, consistent API, while simultaneously exploiting C++ language features to provide maximum performance and flexibility for expert users.
* More info + downloads: https://mlpack.org
* Git repo: https://github.com/mlpack/mlpack

Downloads: 0 This Week

Last Update: 2023-06-28
See Project
18

Texthero

Text preprocessing, representation and visualization from zero to hero

Texthero is a python package to work with text data efficiently. It empowers NLP developers with a tool to quickly understand any text-based dataset and it provides a solid pipeline to clean and represent text data, from zero to hero.

Downloads: 0 This Week

Last Update: 2024-08-07
See Project
19

PyArmadillo

linear algebra library for Python

PyArmadillo - streamlined linear algebra library for Python, with emphasis on ease of use. Alternative to NumPy / SciPy.
* Main page: https://pyarma.sourceforge.io
* Documentation: https://pyarma.sourceforge.io/docs.html
* Bug reports: https://pyarma.sourceforge.io/faq.html
* Git repo: https://gitlab.com/jason-rumengan/pyarma

Downloads: 0 This Week

Last Update: 2023-04-19
See Project
20

DeepCluster

Deep Clustering for Unsupervised Learning of Visual Features

DeepCluster is a classic self-supervised clustering-based representation learning algorithm that iteratively groups image features and uses the cluster assignments as pseudo-labels to train the network. In each round, features produced by the network are clustered (e.g. k-means), and the cluster IDs become supervision targets in the next epoch, encouraging the model to refine its representation to better separate semantic groups.

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
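
The DeepCluster entry above says that network features are clustered (e.g. with k-means) and the cluster IDs are reused as pseudo-labels. A minimal sketch of just that clustering step is shown here, assuming plain Lloyd's-style k-means in pure Python. The function name `kmeans`, the hand-made 2-D `features`, and all parameters are illustrative assumptions, not DeepCluster's actual code, and no network training is included.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Toy Lloyd's-style k-means on a list of equal-length tuples.

    Returns (centroids, labels). Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Stand-in for features produced by a network: two well-separated groups.
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, pseudo_labels = kmeans(features, k=2)
```

In a DeepCluster-style loop, `pseudo_labels` would serve as classification targets for the next training round, after which the features are recomputed and clustered again.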
21

GSMLBook

Recipes for basic machine learning algorithms using sklearn in jupyter

...Topics include linear, multilinear, polynomial, stepwise, lasso, ridge, and logistic regression; ROC curves and measures of binary classification; nonlinear regression (including an introduction to gradient descent); classification and regression trees; random forests; neural networks; probabilistic methods (KNN, naive Bayes, QDA, LDA); dimensionality reduction with PCA; support vector machines; and clustering with K-Means, hierarchical, and DBSCAN. Appendices provide a review of probability and linear algebra. While some mathematical foundation is provided, it is not essential for understanding the implementations. The target audience is advanced community college and university students.

Downloads: 0 This Week

Last Update: 2019-12-07
See Project
22

Phusion2

The genome assembly pipeline based on read clustering

Phusion2 is a pipeline for de novo genome assembly using NGS data. It is based upon a strategy called read clustering. Starting with k-mer frequency analysis, this allows for a reasonable selection of the k-mer sizes. K-tuples from raw reads are merged and sorted into a table so that k-mer words occurring multiple times and shared by different reads can be linked. A relation matrix is used to record the shared k-mer words among all the reads. Setting a minimum threshold of shared k-tuples, the whole set of reads can then be clustered into groups using the k-mer sharing information in the relation matrix. ...

Downloads: 0 This Week

Last Update: 2019-04-02
See Project
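
The read-clustering idea in the Phusion2 entry above (index k-mers, count the k-mers shared by each pair of reads, then group reads whose shared count meets a threshold) can be sketched as follows. This is a toy illustration under stated assumptions, not Phusion2's implementation: the names `kmers` and `cluster_reads`, the values `k=4` and `min_shared=2`, and the example reads are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def kmers(read, k):
    """Return the set of distinct k-mers in a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def cluster_reads(reads, k=4, min_shared=2):
    """Group reads that share at least `min_shared` distinct k-mers."""
    # Table: k-mer -> indices of the reads that contain it.
    index = defaultdict(set)
    for i, read in enumerate(reads):
        for km in kmers(read, k):
            index[km].add(i)

    # Sparse "relation matrix": (read a, read b) -> number of shared k-mers.
    shared = defaultdict(int)
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1

    # Union-find merges reads linked by enough shared k-mers.
    parent = list(range(len(reads)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), n in shared.items():
        if n >= min_shared:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for i in range(len(reads)):
        groups[find(i)].append(i)
    return sorted(groups.values())


reads = ["ACGTACGGA", "GTACGGATT", "TTTTCCCCA", "TTCCCCAGG"]
clusters = cluster_reads(reads)
print(clusters)  # [[0, 1], [2, 3]]
```

The pairwise counting here is quadratic in the number of reads sharing a k-mer, which is fine for a toy example but would not scale to real NGS data.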
23

Oryx

Lambda architecture on Apache Spark, Apache Kafka for real-time

Oryx 2 is a realization of the lambda architecture built on Apache Spark and Apache Kafka, but with specialization for real-time large-scale machine learning. It is a framework for building applications but also includes packaged, end-to-end applications for collaborative filtering, classification, regression and clustering. The application is written in Java, using Apache Spark, Hadoop, Tomcat, Kafka, Zookeeper and more. Configuration uses a single Typesafe Config config file, wherein...

Downloads: 0 This Week

Last Update: 2023-08-16
See Project
24

EECluster

Tool for energy-efficient resource management in HPC clusters

EECluster is a software tool for managing the energy-efficient allocation of cluster resources. EECluster uses a Hybrid Genetic Fuzzy System as the decision-making mechanism that elicits part of its rule base dependent on the cluster workload scenario, delivering good compliance with the administrator preferences. In the latest version, we leverage a more sophisticated and exhaustive model that covers a wider range of environmental aspects and balances service quality and power...

Downloads: 0 This Week

Last Update: 2021-10-30
See Project
25

NNC

Nuclear Norm Clustering

We present Nuclear Norm Clustering (NNC), an algorithm that can be used in different fields as a promising alternative to the k-means clustering method, and that is less sensitive to outliers. The NNC algorithm requires users to provide a data matrix M and a desired number of clusters K. We employed simulated annealing techniques to choose an optimal L that minimizes NN(L).

Downloads: 0 This Week

Last Update: 2018-07-07
See Project


