Page 2 | nlp free download

Showing 477 open source projects for "nlp"

1

WikiChat

WikiChat is an improved RAG

WikiChat is a chatbot framework designed to interactively retrieve and summarize Wikipedia information, allowing users to ask questions and get context-aware responses.

Downloads: 0 This Week

Last Update: 2025-04-29
See Project
2

Docspell

Assist in organizing your piles of documents

...Docspell can help by suggesting correspondents, guessing tags, or finding dates using machine learning. It can learn metadata from existing documents and find things using NLP. This makes adding metadata to your documents a lot easier. For machine learning, it relies on the free (GPL) Stanford CoreNLP library.

Downloads: 5 This Week

Last Update: 2025-03-15
See Project
3

Data-Juicer

Data processing for and with foundation models

Data-Juicer is an open-source data processing and augmentation framework designed to enhance the quality and diversity of datasets for machine learning tasks. It includes a modular pipeline for scalable data transformation.

Downloads: 1 This Week

Last Update: 2026-03-17
See Project
4

Search-Index

A persistent, network resilient, full text search library

Search-Index is a lightweight and fast JavaScript-based search engine that enables full-text search indexing and retrieval for web applications.

Downloads: 11 This Week

Last Update: 2025-03-12
See Project
5

textlint

The pluggable natural language linter for text and markdown

Textlint is an extensible linting tool for text and markdown files, designed to enforce style guidelines, detect errors, and improve writing quality.

Downloads: 11 This Week

Last Update: 2026-04-08
See Project
6

DeepPavlov

A library for deep learning end-to-end dialog systems and chatbots

...Follow step-by-step instructions to install, configure, and extend the DeepPavlov framework for your use case. DeepPavlov is an open-source framework for developing chatbots and virtual assistants. It has comprehensive and flexible tools that let developers and NLP researchers create production-ready conversational skills and complex multi-skill conversational assistants. Use BERT and other state-of-the-art deep learning models to solve classification, NER, Q&A, and other NLP tasks. DeepPavlov Agent allows building industrial solutions with multi-skill integration via API services.

Downloads: 0 This Week

Last Update: 2024-08-12
See Project
7

LightAutoML

Fast and customizable framework for automatic ML model creation

LightAutoML is an automated machine learning (AutoML) framework optimized for efficient model training and hyperparameter tuning, focusing on both tabular and text data.

Downloads: 0 This Week

Last Update: 2025-12-04
See Project
8

HanLP

Han Language Processing

HanLP is a multilingual Natural Language Processing (NLP) library composed of a series of models and algorithms. Built on TensorFlow 2.0, it was designed to advance state-of-the-art deep learning techniques and popularize the application of natural language processing in both academia and industry. HanLP is capable of lexical analysis (Chinese word segmentation, part-of-speech tagging, named entity recognition), syntax analysis, text classification, and sentiment analysis.

Downloads: 8 This Week

Last Update: 2025-03-07
See Project
9

ExtractThinker

ExtractThinker is a Document Intelligence library for LLMs

ExtractThinker is a library for extracting structured information from documents with LLMs, aiding in data processing and knowledge discovery.

Downloads: 8 This Week

Last Update: 2025-06-09
See Project
10

Adapters

A Unified Library for Parameter-Efficient Learning

...Q-LoRA, Q-Bottleneck Adapters, or Q-PrefixTuning), adapter merging via task arithmetic, or the composition of multiple adapters via composition blocks, allowing advanced research in parameter-efficient transfer learning for NLP tasks.

Downloads: 0 This Week

Last Update: 2025-05-20
See Project
11

Chonkie

The no-nonsense RAG chunking library

Chonkie is a lightweight library for splitting text into chunks for use in retrieval-augmented generation (RAG) pipelines.

Downloads: 7 This Week

Last Update: 2025-03-01
See Project
12

Stanford CoreNLP

Stanford CoreNLP, a Java suite of core NLP tools

...CoreNLP currently supports 6 languages: Arabic, Chinese, English, French, German, and Spanish. The centerpiece of CoreNLP is the pipeline. Pipelines take in raw text, run a series of NLP annotators on the text, and produce a final set of annotations. Pipelines produce CoreDocuments, data objects that contain all of the annotation information, are accessible with a simple API, and are serializable to a Google Protocol Buffer. CoreNLP generates a variety of linguistic annotations, including parts of speech, named entities, dependency parses, and coreference.

Downloads: 4 This Week

Last Update: 2025-06-07
See Project
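
The pipeline idea described in the Stanford CoreNLP entry above (raw text in, a series of annotators run in order, annotations out) can be illustrated with a toy sketch. This is not CoreNLP's API, which is Java; the names `Document`, `tokenize`, `split_sentences`, and `run_pipeline` are hypothetical and exist only for this illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical container: raw text plus whatever annotators attach."""
    text: str
    annotations: Dict[str, object] = field(default_factory=dict)


# An annotator reads the document and adds one annotation layer to it.
Annotator = Callable[[Document], None]


def tokenize(doc: Document) -> None:
    # Words become one token each; every punctuation mark is its own token.
    doc.annotations["tokens"] = re.findall(r"\w+|[^\w\s]", doc.text)


def split_sentences(doc: Document) -> None:
    # Depends on the "tokens" layer, so it must run after tokenize.
    sentences: List[List[str]] = []
    current: List[str] = []
    for token in doc.annotations["tokens"]:
        current.append(token)
        if token in {".", "!", "?"}:
            sentences.append(current)
            current = []
    if current:
        sentences.append(current)
    doc.annotations["sentences"] = sentences


def run_pipeline(text: str, annotators: List[Annotator]) -> Document:
    # Run each annotator in order over the same document object.
    doc = Document(text)
    for annotate in annotators:
        annotate(doc)
    return doc


doc = run_pipeline(
    "Pipelines take raw text. They add annotations!",
    [tokenize, split_sentences],
)
print(len(doc.annotations["tokens"]), len(doc.annotations["sentences"]))
```

The order of the annotator list matters in this sketch because later annotators read layers written by earlier ones, which mirrors the "series of annotators" wording in the description.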
13

spacy-llm

Integrating LLMs into structured NLP pipelines

Large Language Models (LLMs) feature powerful natural language understanding capabilities. With only a few (and sometimes no) examples, an LLM can be prompted to perform custom NLP tasks such as text categorization, named entity recognition, coreference resolution, information extraction and more. This package integrates Large Language Models (LLMs) into spaCy, featuring a modular system for fast prototyping and prompting, and turning unstructured responses into robust outputs for various NLP tasks, no training data required.

Downloads: 4 This Week

Last Update: 2026-03-24
See Project
14

Datasets

Hub of ready-to-use datasets for ML models

Datasets is a library for easily accessing and sharing datasets and evaluation metrics for Natural Language Processing (NLP), computer vision, and audio tasks. Load a dataset in a single line of code, and use its data processing methods to quickly get your dataset ready for training in a deep learning model. Backed by the Apache Arrow format, it processes large datasets with zero-copy reads without any memory constraints for optimal speed and efficiency.

Downloads: 5 This Week

Last Update: 2026-03-23
See Project
15

PaperAI

Semantic search and workflows for medical/scientific papers

PaperAI is an open-source framework for searching and analyzing scientific papers, particularly useful for researchers looking to extract insights from large-scale document collections.

Downloads: 8 This Week

Last Update: 2025-07-01
See Project
16

BEIR

A Heterogeneous Benchmark for Information Retrieval

BEIR is a benchmark framework for evaluating information retrieval models across various datasets and tasks, including document ranking and question answering.

Downloads: 5 This Week

Last Update: 2025-06-04
See Project
17

DeepSparse

Sparsity-aware deep learning inference runtime for CPUs

A sparsity-aware enterprise inferencing system for AI models on CPUs. Maximize your CPU infrastructure with DeepSparse to run performant computer vision (CV), natural language processing (NLP), and large language models (LLMs).

Downloads: 0 This Week

Last Update: 2025-06-02
See Project
18

Dawarich

Self-hostable alternative to Google Timeline

Dawarich is a self-hostable web application for recording and visualizing your location history, intended as an alternative to Google Timeline.

Downloads: 6 This Week

Last Update: 2026-04-01
See Project
19

FastRAG

Efficient Retrieval Augmentation and Generation Framework

fastRAG is a research framework for efficient and optimized retrieval augmented generative pipelines, incorporating state-of-the-art LLMs and Information Retrieval. fastRAG is designed to empower researchers and developers with a comprehensive tool set for advancing retrieval augmented generation.

Downloads: 6 This Week

Last Update: 2025-01-24
See Project
20

Detoxify

Trained models & code to predict toxic comments

Detoxify is a deep learning-based tool for detecting and filtering toxic language in online conversations, leveraging Transformer models for high accuracy.

Downloads: 2 This Week

Last Update: 2026-03-26
See Project
21

Stanza

Stanford NLP Python library for many human languages

Stanza is a Python natural language analysis package: a collection of accurate and efficient tools for the linguistic analysis of many human languages. From raw text to syntactic analysis and entity recognition, Stanza brings state-of-the-art NLP models to languages of your choosing. Its tools can be used in a pipeline to convert a string of human language text into lists of sentences and words, generate the base forms of those words along with their parts of speech and morphological features, produce a syntactic dependency parse, and recognize named entities. ...

Downloads: 3 This Week

Last Update: 2026-02-26
See Project
22

Classical Language Toolkit (CLTK)

The Classical Language Toolkit

The Classical Language Toolkit (CLTK) is a Python library offering natural language processing support for classical languages, including Latin, Greek, and others.

Downloads: 5 This Week

Last Update: 2025-05-04
See Project
23

STORM

An LLM-powered knowledge curation system that researches topics

STORM is an open-source, LLM-powered knowledge curation system developed by Stanford's OVAL lab. It researches a topic and generates a full-length, Wikipedia-style report with citations.

Downloads: 5 This Week

Last Update: 2025-01-23
See Project
24

Natural Language Toolkit

NLTK Source

The Natural Language Toolkit (NLTK) is a widely used open-source Python library designed for working with human language data and building natural language processing (NLP) applications. It provides a comprehensive suite of modules, datasets, and tutorials that support both symbolic and statistical approaches to language processing. The toolkit includes implementations of many foundational NLP algorithms and utilities, enabling developers to perform tasks such as tokenization, stemming, parsing, classification, and semantic reasoning. ...

Downloads: 0 This Week

Last Update: 2026-03-24
See Project
25

AdalFlow

The library to build & auto-optimize LLM applications

AdalFlow is a library for building LLM applications, such as chatbots, RAG pipelines, and agents, and for automatically optimizing their prompts.

Downloads: 3 This Week

Last Update: 2025-09-25
See Project