Page 3 | /storage/emulated/0/android/data/net.sourceforge.uiq3.fx603p/files free download

BioNLP

BioNLP is an initiative by the University of Colorado Denver Health Sciences Center to create and distribute code, software, and data for applying natural language processing techniques to biomedical texts.

Downloads: 0 This Week

Last Update: 2022-10-26

See Project

Graph4NLP

Graph4NLP is the library for the easy use of Graph Neural Networks

...Graph4NLP consists of four different layers: 1) Data Layer, 2) Module Layer, 3) Model Layer, and 4) Application Layer. Graph4NLP aims to make it easy to use GNNs in NLP tasks (check out the Graph4NLP documentation).

Downloads: 3 This Week

Last Update: 2022-08-16

See Project

MITRE Annotation Toolkit

A toolkit for managing and manipulating text annotations

...It can be customized for specific tasks (e.g., named entity identification, de-identification of medical records). The goal of MAT is not to help you configure your training engine (in the default case, the Carafe CRF system) to achieve the best possible performance on your data. MAT is for "everything else": all the tools you end up wishing you had.

Downloads: 2 This Week

Last Update: 2023-04-19

See Project

Data augmentation

List of useful data augmentation resources

Here you will find links to GitHub repos, libraries, papers, and other information of varying popularity. Data augmentation can be described simply as any method that makes a dataset larger. To create more images, for example, we could zoom in and save the result, change the brightness of the image, or rotate it.
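The three image operations mentioned above (zoom, brightness change, rotation) can be sketched in a few lines. This is a minimal illustration only, assuming a grayscale image stored as a list of rows of integers in 0..255; the function names are hypothetical and are not taken from any of the listed resources, which typically rely on dedicated image libraries.

```python
# Toy grayscale image: a list of rows, each pixel an int in 0..255.
# All function names here are hypothetical, chosen for illustration.

def adjust_brightness(img, delta):
    """Add delta to every pixel, clamping the result to 0..255."""
    return [[max(0, min(255, p + delta)) for p in row] for row in img]


def rotate_90_clockwise(img):
    """Rotate by 90 degrees clockwise: reverse the row order, then transpose."""
    return [list(col) for col in zip(*img[::-1])]


def zoom_center(img, factor=2):
    """Crop the central 1/factor region and upscale it by pixel repetition."""
    h, w = len(img), len(img[0])
    ch, cw = h // factor, w // factor
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = [row[left:left + cw] for row in img[top:top + ch]]
    out = []
    for row in crop:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def augment(img):
    """Return the original image plus three augmented variants."""
    return [
        img,
        adjust_brightness(img, 40),
        rotate_90_clockwise(img),
        zoom_center(img),
    ]
```

Calling `augment(img)` on one image yields four training samples, which is the sense in which augmentation "makes the dataset larger".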

Downloads: 0 This Week

Last Update: 2023-03-21

See Project

libpostal

A C library for parsing/normalizing street addresses around the world

libpostal is a C library for parsing/normalizing street addresses around the world using statistical NLP and open geo data. The goal of this project is to understand location-based strings in every language, everywhere. Addresses and the locations they represent are essential for any application dealing with maps (place search, transportation, on-demand/delivery services, check-ins, reviews). ...

Downloads: 0 This Week

Last Update: 2022-05-02

See Project

XLM (Cross-lingual Language Model)

PyTorch original implementation of Cross-lingual Language Model

XLM (Cross-lingual Language Model) is a family of multilingual pretraining methods that align representations across languages to enable strong zero-shot transfer. It popularized objectives like Masked Language Modeling (MLM) across many languages and Translation Language Modeling (TLM) that jointly trains on parallel sentence pairs to tighten cross-lingual alignment. Using a shared subword vocabulary, XLM learns language-agnostic features that work well for classification and sequence labeling tasks such as XNLI, NER, and POS without target-language supervision. ...

Downloads: 0 This Week

Last Update: 2025-10-07

See Project

Duckling

Language, engine, and tooling for testing composable language rules

Duckling is a Haskell library developed by Facebook for parsing and normalizing natural language expressions into structured data. It supports a wide range of entities such as dates, times, durations, distances, temperatures, numbers, and currencies. Designed for use in conversational agents, chatbots, and natural language processing applications, Duckling converts fuzzy user input into a consistent and machine-readable format. It features multi-language support and is widely used in production environments requiring robust entity extraction.

Downloads: 0 This Week

Last Update: 2025-07-17

See Project

Self-Attentive Parser

High-accuracy NLP parser with models for 11 languages

Self-Attentive Parser, also known as the Berkeley Neural Parser, is a high-accuracy constituency parser with models for 11 languages, implemented in Python.

Downloads: 0 This Week

Last Update: 2025-01-30

See Project

Synonyms

Chinese synonyms, chat robot, intelligent question and answer toolkit

...Classes and subclasses, sort out the relationship between words, the extended version of the synonym word forest contains more than 70,000 words, of which more than 30,000 words are shared in the form of open data.

Downloads: 0 This Week

Last Update: 2022-01-14

See Project

Parsr

Transforms PDF, Documents and Images into Enriched Structured Data

Parsr is an open-source document parsing tool that converts PDFs, scanned images, and other structured documents into structured, machine-readable data formats.

Downloads: 3 This Week

Last Update: 2025-01-21

See Project

NLP Architect

A model library for exploring state-of-the-art deep learning

...The library is designed to be a tool for model development: pre-processing data, then building, training, validating, running inference with, and saving or loading a model.

Downloads: 0 This Week

Last Update: 2022-08-05

See Project

fastNLP

fastNLP: A Modularized and Extensible NLP Framework

fastNLP is a lightweight framework for natural language processing (NLP) whose goal is to let you quickly implement NLP tasks and build complex models. A unified tabular data container simplifies data preprocessing. Built-in Loader and Pipe classes cover multiple datasets, eliminating the need for preprocessing code. It offers various convenient NLP tools, such as embedding loading (including ELMo and BERT) and intermediate data caching. It also provides a variety of neural network components and reproduced models (covering tasks such as Chinese word segmentation, named entity recognition, syntactic analysis, text classification, text matching, coreference resolution, summarization, etc.). ...

Downloads: 0 This Week

Last Update: 2022-08-05

See Project

CC-Net

Tools to download and cleanup Common Crawl data

cc_net provides tools to download, segment, clean, and filter Common Crawl to build large-scale text corpora, including monolingual datasets and the multilingual CC-100 collection introduced in the associated paper. It includes pipelines to fetch snapshots, extract text, de-duplicate, identify language, and apply quality filtering based on heuristics and language models. The outputs are intended for pretraining language models and for creating standardized corpora that can be reproduced or...
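The de-duplication step in a pipeline like this can be illustrated with a short sketch. This is not cc_net's actual code: it assumes paragraph-level de-duplication by hashing a normalized form of each paragraph, and the normalization rules, the 8-byte hash prefix, and all function names are assumptions made for illustration.

```python
import hashlib
import unicodedata


def normalize(paragraph):
    """Lowercase, apply NFKC normalization, and collapse whitespace (assumed rules)."""
    text = unicodedata.normalize("NFKC", paragraph).lower()
    return " ".join(text.split())


def paragraph_hash(paragraph):
    """Return a short fingerprint of the normalized paragraph."""
    digest = hashlib.sha1(normalize(paragraph).encode("utf-8")).digest()
    return digest[:8]  # a prefix keeps the set of seen hashes small


def deduplicate(documents):
    """Drop paragraphs already seen anywhere in the corpus.

    Each document is a string with one paragraph per line. Documents that
    end up with no remaining paragraphs are dropped entirely.
    """
    seen = set()
    result = []
    for doc in documents:
        kept = []
        for para in doc.split("\n"):
            if not para.strip():
                continue  # skip empty lines
            h = paragraph_hash(para)
            if h in seen:
                continue  # duplicate of an earlier paragraph
            seen.add(h)
            kept.append(para)
        if kept:
            result.append("\n".join(kept))
    return result
```

Hashing a normalized form means that paragraphs differing only in case or spacing count as duplicates, which is useful for boilerplate such as navigation menus and cookie notices repeated across web pages.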

Downloads: 0 This Week

Last Update: 2025-10-11

See Project

GluonNLP

NLP made easy

GluonNLP is a toolkit that helps you solve NLP problems. It provides easy-to-use tools that help you load text data, process it, and train models. To support both engineers and researchers, we provide command-line toolkits for downloading and processing NLP datasets. GluonNLP makes it easy to evaluate and train word embeddings. Here are examples to evaluate the pre-trained embeddings included in the GluonNLP toolkit as well as example scripts for training embeddings on custom datasets. ...

Downloads: 0 This Week

Last Update: 2022-08-08

See Project

Delta ML

Deep learning based natural language and speech processing platform

...DELTA has been used to develop several state-of-the-art algorithms for publications and to deliver production systems that serve millions of users. It helps you train, develop, and deploy NLP and/or speech models. Use configuration files to easily tune parameters and network structures. What you see in training is what you get in serving: all data processing and feature extraction are integrated into a model graph. Supported tasks include text classification, named entity recognition, question answering, and text summarization. Uniform I/O interfaces mean no changes are needed for new models.

Downloads: 0 This Week

Last Update: 2022-08-15

See Project

NLP Best Practices

Natural Language Processing Best Practices & Examples

In recent years, natural language processing (NLP) has seen quick growth in quality and usability, and this has helped to drive business adoption of artificial intelligence (AI) solutions. In the last few years, researchers have been applying newer deep learning methods to NLP. Data scientists started moving from traditional methods to state-of-the-art (SOTA) deep neural network (DNN) algorithms which use language models pretrained on large text corpora. This repository contains examples and best practices for building NLP systems, provided as Jupyter notebooks and utility functions. The focus of the repository is on state-of-the-art methods and common scenarios that are popular among researchers and practitioners working on problems involving text and language. ...

Downloads: 0 This Week

Last Update: 2022-08-01

See Project

cocoNLP

A Chinese information extraction tool

...Its API is intentionally simple, so you can drop it into scripts, ETL jobs, or dashboards without deep ML expertise. Because it aims at utility over complexity, it’s useful for prototyping data products or building lightweight text analytics where large models would be overkill. The repository also includes examples and test snippets to help you understand expected inputs and typical outputs, which shortens the learning curve for newcomers.

Downloads: 0 This Week

Last Update: 2025-11-05

See Project

KSUCCA Corpus

A 50-million-token corpus of Classical Arabic.

King Saud University Corpus of Classical Arabic (KSUCCA) is a pioneering 50-million-token annotated corpus of Classical Arabic texts from the pre-Islamic era until the fourth Hijri century (equivalent to the period from the seventh until the early eleventh century CE), which is the period of pure Classical Arabic. The main aim of this corpus is to support the study of the distributional lexical semantics of the words of the Quran. However, it can be used for other research purposes, such...

Downloads: 7 This Week

Last Update: 2020-02-19

See Project

Chatito

Dataset generation for AI chatbots, NLP tasks

Chatito is a tool that helps generate datasets for training and validating chatbot models using a simple domain-specific language (DSL).
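The general idea of generating a dataset from templates can be sketched as follows. This is a hedged illustration of template expansion in general, not Chatito's DSL or implementation: the `{slot}` placeholder notation, the `expand` function, and the output format are all hypothetical.

```python
import re
from itertools import product

# Hypothetical placeholder notation: {slot_name}
SLOT = re.compile(r"\{(\w+)\}")


def expand(template, slots):
    """Generate every sentence obtainable by filling the template's slots.

    `slots` maps each slot name to its list of candidate values. Each
    generated example records both the text and the slot values used,
    so it can serve as labeled training data.
    """
    # Unique slot names in order of first appearance.
    names = list(dict.fromkeys(SLOT.findall(template)))
    examples = []
    for combo in product(*(slots[name] for name in names)):
        values = dict(zip(names, combo))
        text = SLOT.sub(lambda m: values[m.group(1)], template)
        examples.append({"text": text, "slots": values})
    return examples
```

For instance, `expand("{greet} book a table in {city}", {"greet": ["please", "hey"], "city": ["Paris", "Rome"]})` produces four labeled sentences, one per combination of slot values.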

Downloads: 4 This Week

Last Update: 2025-01-30

See Project

Rasa-UI

Rasa UI is a frontend for the Rasa Framework

Rasa UI is a web application built on top of, and for, Rasa. It lets you quickly and easily create and manage bots, NLU components (Regex, Examples, Entities, Intents, etc.) and Core components (Stories, Actions, Responses, etc.) through a web interface. It also provides some convenience features for Rasa, like training and loading your models, monitoring usage, or viewing logs.

Downloads: 2 This Week

Last Update: 2025-01-24

See Project

OLiA

OWL/DL ontologies for linguistic annotations

MOVED TO https://github.com/acoli-repo/olia. The Ontologies of Linguistic Annotations (OLiA) provide an OWL/DL taxonomy of data categories as a reference for linguistic annotation (OLiA Reference Model), plus OWL/DL models for a large number of annotation schemes (OLiA Annotation Models) and their relationship to reference data categories (OLiA Linking Models). The OLiA Reference Model itself is linked to community-maintained repositories such as GOLD (http://linguistics-ontology.org/) and ISOcat (http://www.isocat.org). The OLiA ontologies were originally developed as part of an infrastructure for the sustainable maintenance of linguistic resources (http://www.sfb441.uni-tuebingen.de/c2/index-engl.html). Their fields of application include the formalization of annotation schemes, concept-based querying over heterogeneously annotated corpora, and the development of interoperable NLP pipelines.

Downloads: 1 This Week

Last Update: 2019-11-11

See Project

artext

Probabilistic Noising of Natural Language

Artext is a project on injecting noise into text without affecting the core meaning for a human reader. This kind of data can be useful for many NLP tasks, particularly to make models robust to erroneous text. This is a work in progress, and we will publish the results of our experiments soon. Meanwhile, if you use artext in your research, please cite this repository. GitHub: https://github.com/nlpcl-lab/artext

Downloads: 0 This Week

Last Update: 2019-11-07

See Project

Safe Harbor Deidentification

Safe Harbor Deidentification for medical documents

Phalanx - Deidentify. The Safe Harbor Deidentification Mode of Phalanx is an abridged pipeline of NLP annotators culminating in NER annotators that write out text offsets. It uses the Safe Harbor deidentification method.

Downloads: 0 This Week

Last Update: 2019-09-10

See Project

TIES

A smart search engine for medical documents

TIES (Text Information Extraction System) is a clinical text search engine that uses Natural Language Processing techniques to extract medical concepts from free-text clinical reports. It provides secure, de-identified access to this information and has built-in collaboration tools and honest broker functionality. It is licensed for academic use under the BSD license. For commercial use please contact Nexi at http://nexihub.com *** NOTICE: this software and forum are no longer...

1 Review

Downloads: 0 This Week

Last Update: 2019-09-09

See Project

TEXT2DATA

Text Analytics Platform

Bring a text analytics platform that uses NLP (Natural Language Processing) and machine learning to your work environment. Extract essential information from your text documents and let artificial intelligence save you time. Get detailed and agile reports on your unstructured data.

Downloads: 0 This Week

Last Update: 2019-07-17

See Project

Search Results for "/storage/emulated/0/android/data/net.sourceforge.uiq3.fx603p/files" - Page 3

Showing 110 open source projects for "/storage/emulated/0/android/data/net.sourceforge.uiq3.fx603p/files"

