Showing 392 open source projects for "vocabulary"

  • 1
    SAM 3

    Code for running inference and finetuning with SAM 3 model

    SAM 3 (Segment Anything Model 3) is a unified foundation model for promptable segmentation in both images and videos, capable of detecting, segmenting, and tracking objects. It accepts both text prompts (open-vocabulary concepts like “red car” or “goalkeeper in white”) and visual prompts (points, boxes, masks) and returns high-quality masks, boxes, and scores for the requested concepts. Compared with SAM 2, SAM 3 introduces the ability to exhaustively segment all instances of an open-vocabulary concept specified by a short phrase or exemplars, scaling to a vastly larger set of categories than traditional closed-set models. ...
    Downloads: 51 This Week
  • 2
    Vosk Speech Recognition Toolkit

    Offline speech recognition API for Android, iOS, Raspberry Pi

    ...It enables speech recognition for 20+ languages and dialects: English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, Swedish, Japanese, Esperanto, Hindi, Czech, Polish. More to come. Vosk models are small (50 MB) but provide continuous large-vocabulary transcription, zero-latency response with a streaming API, reconfigurable vocabulary, and speaker identification. Speech recognition bindings are implemented for various programming languages like Python, Java, Node.JS, C#, C++, Rust, Go and others. Vosk supplies speech recognition for chatbots, smart home appliances, and virtual assistants. ...
    Downloads: 103 This Week
  • 3
    My Vocabulary

    Simple Vocabulary app

    A tiny, always-on-top overlay flashcard app for effortless vocabulary learning while you work or browse. ⚠️ Note about full-screen games: some games use exclusive fullscreen (DirectX/OpenGL/Vulkan), and in that mode overlays cannot draw on top. Switch to borderless windowed (fullscreen) mode instead.
    Downloads: 5 This Week
  • 4
    Rime ICE

    rime-ice is a highly optimized schema for the RIME input method

    rime-ice is a highly optimized schema for the RIME (中州韻) input method engine, offering a clean, intelligent, and efficient Chinese input experience. Built with modular configuration files and designed for performance, rime-ice provides powerful input suggestions, simplified vocabulary, and flexible customization, catering to users who want a streamlined and practical Chinese typing setup.
    Downloads: 6 This Week
  • 5
    SentencePiece

    Unsupervised text tokenizer for Neural Network-based text generation

    SentencePiece is an unsupervised text tokenizer and detokenizer mainly for Neural Network-based text generation systems where the vocabulary size is predetermined prior to the neural model training. SentencePiece implements subword units (e.g., byte-pair-encoding (BPE) [Sennrich et al.] and unigram language model [Kudo]) with the extension of direct training from raw sentences. SentencePiece allows us to make a purely end-to-end system that does not depend on language-specific pre/postprocessing. ...
    Downloads: 10 This Week
  • 6
    HomeRobot

    Mobile manipulation research tools for roboticists

    ...It provides interfaces for Detic, Grounded-SAM, and Contact-GraspNet, allowing open-vocabulary detection and 3D grasping.
    Downloads: 0 This Week
  • 7
    Pot Desktop

    Cross-platform software for text translation and recognition

    ...The tool supports external plugin extensions, which means its functionality can be expanded far beyond the built-in options: you can add translation engines, OCR backends, TTS engines, vocabulary export (e.g. for language learning), and more. Pot-Desktop works on Windows, macOS, and Linux (including Wayland environments), and offers convenient installers or package-manager installation methods (e.g. via brew or .deb, etc.), so it’s accessible for users on all major desktop OSes.
    Downloads: 39 This Week
  • 8
    alphageometry

    AI-driven neuro-symbolic solver for high-school geometry problems

    ...The DDAR solver focuses purely on rule-based reasoning, while AlphaGeometry enhances this by using a learned model to suggest auxiliary constructions when logical reasoning alone is insufficient. The repository includes pre-trained weights, vocabulary files, and detailed configuration options for reproducing experiments.
    Downloads: 8 This Week
  • 9
    Schema.DTS

    JSON-LD TypeScript types for Schema vocabulary

    The project provides a comprehensive set of TypeScript typings based on the Schema vocabulary, enabling developers to author JSON-LD structured data with strong type safety. It supplies both high-level discriminated unions and helper types to model contexts, graphs, and linked data relationships with clarity and accuracy. Usage examples demonstrate how one can import types like Person, WithContext, or Graph and compose JSON-LD objects in a way that aligns with semantic-web and knowledge-graph practices. ...
    Downloads: 0 This Week
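
To make the JSON-LD shape concrete, here is a minimal sketch of the kind of objects that the Person, WithContext, and Graph types mentioned above describe. It is written in Python rather than TypeScript and does not use the package itself; the helper names `person_jsonld` and `graph_jsonld`, and the sample names, are hypothetical and chosen only for illustration.

```python
import json

SCHEMA_CONTEXT = "https://schema.org"  # the Schema.org JSON-LD context


def person_jsonld(name, url=None):
    """Build a standalone JSON-LD object of type Person (hypothetical helper)."""
    doc = {"@context": SCHEMA_CONTEXT, "@type": "Person", "name": name}
    if url is not None:
        doc["url"] = url
    return doc


def graph_jsonld(*nodes):
    """Wrap several nodes in one @graph that shares a single @context."""
    stripped = [
        {key: value for key, value in node.items() if key != "@context"}
        for node in nodes
    ]
    return {"@context": SCHEMA_CONTEXT, "@graph": stripped}


if __name__ == "__main__":
    ada = person_jsonld("Ada Lovelace")
    grace = person_jsonld("Grace Hopper", url="https://example.org/grace")
    print(json.dumps(graph_jsonld(ada, grace), indent=2))
```

A typed binding adds compile-time checks on top of this shape, for example rejecting a misspelled property name, which a plain dictionary cannot do.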
  • 10
    English-level-up-tips

    An advanced guide to learn English which might benefit you a lot

    English-level-up-tips is a comprehensive open-source guide designed to help learners improve their English language skills across a broad range of competencies, from vocabulary and grammar to listening, speaking, reading, and writing. Structured as a language learning tutorial, the project aggregates tips, strategies, explanations, and resources that go beyond simple phrase lists, encouraging learners to develop a deep understanding of how English works and how to use it effectively. The repository includes structured sections that address different skill areas with lessons, exercises, and recommended approaches tailored to learners at various stages of proficiency. ...
    Downloads: 0 This Week
  • 11
    minbpe

    Minimal, clean code for the Byte Pair Encoding (BPE) algorithm

    ...It operates on UTF-8 encoded bytes rather than Unicode characters, which makes it robust to arbitrary text inputs and avoids needing a language-specific character vocabulary. The repository is structured as a teaching-oriented implementation that shows how to train a tokenizer by learning merge rules, then apply those merges to encode text into token IDs and decode tokens back into text. It is intentionally small and readable so developers can understand each stage of BPE, including the mechanics of pair counting, merge application, and vocabulary growth. ...
    Downloads: 0 This Week
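
The stages named in the description (pair counting, merge application, vocabulary growth, then encoding and decoding) can be sketched in a few lines. This is an illustrative byte-level BPE written for this listing, not the repository's code; the function names `pair_counts`, `merge`, `train`, `encode`, and `decode` are assumptions, and new token IDs are assumed to start at 256, right after the 256 possible byte values.

```python
from collections import Counter


def pair_counts(ids):
    """Count how often each adjacent pair of token IDs occurs."""
    return Counter(zip(ids, ids[1:]))


def merge(ids, pair, new_id):
    """Replace each left-to-right, non-overlapping occurrence of pair with new_id."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train(text, num_merges):
    """Learn up to num_merges merge rules from the UTF-8 bytes of text."""
    ids = list(text.encode("utf-8"))
    merges = {}  # (left_id, right_id) -> new_id, in the order learned
    for step in range(num_merges):
        counts = pair_counts(ids)
        if not counts:
            break  # fewer than two tokens left, nothing to merge
        pair = max(counts, key=counts.get)  # most frequent pair
        new_id = 256 + step  # the vocabulary grows by one token per merge
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return merges


def encode(text, merges):
    """Turn text into token IDs by replaying the merges in training order."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():
        ids = merge(ids, pair, new_id)
    return ids


def decode(ids, merges):
    """Expand token IDs back into bytes, then into text."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")


if __name__ == "__main__":
    rules = train("aaabdaaabac", 3)
    tokens = encode("aaabdaaabac", rules)
    print(tokens, decode(tokens, rules))
```

Because the base vocabulary is the 256 byte values, any input string can be encoded and decoded even if it contains characters never seen during training; unseen text simply compresses less.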
  • 12
    IBM-ODM-Docker

    This repository lets you deploy IBM Operational Decision Manager

    ...IBM ODM is a decisioning platform to automate your business policies. Business rules are used at the heart of the platform to implement decision logic on a business vocabulary and run it as web decision services.
    Downloads: 0 This Week
  • 13
    ML Ferret

    Refer and Ground Anything Anywhere at Any Granularity

    Ferret is Apple’s end-to-end multimodal large language model designed specifically for flexible referring and grounding: it can understand references of any granularity (boxes, points, free-form regions) and then ground open-vocabulary descriptions back onto the image. The core idea is a hybrid region representation that mixes discrete coordinates with continuous visual features, so the model can fluidly handle “any-form” referring while maintaining precise spatial localization. The repo presents the vision-language pipeline, model assets, and paper resources that show how Ferret answers questions, follows instructions, and returns grounded outputs rather than just text. ...
    Downloads: 0 This Week
  • 14
    Mistral Finetune

    Memory-efficient and performant finetuning of Mistral's models

    ...The repo includes utilities for data preprocessing (e.g. reformat_data.py), validation scripts, and example YAML configs for training variants like 7B base or instruct models. It supports function-calling style datasets (via "messages" keys) as well as plain text formats, with guidelines on formatting, tokenization, and vocabulary extension (e.g. extending vocab to 32768 for some models) before finetuning. The project also provides tutorial notebooks (e.g. mistral_finetune_7b.ipynb) to walk through the steps.
    Downloads: 0 This Week
  • 15
    Read Frog

    Open Source Immersive Translate

    Read Frog is an open-source browser extension designed to transform everyday web reading into an immersive language learning experience powered by artificial intelligence. The tool integrates translation, contextual explanations, and content analysis directly into the browsing workflow so users can learn languages naturally while reading authentic online content. Instead of forcing learners to switch between translation tools and the original text, the extension displays translations...
    Downloads: 13 This Week
  • 16
    Amical

    Open Source AI Dictation App

    ...It leverages both local and cloud-based AI models, letting users seamlessly switch between providers for the ideal balance of speed, precision, and control, and understands the context of each app in use to automatically format text in a tone and style appropriate to the platform. Users can enhance transcription accuracy with custom vocabulary tailored to industry jargon, proper nouns, and personal terms, and set up personalized voice shortcuts to trigger workflows or dictate across applications. Amical supports multilingual dictation with over 50 languages at native-level accuracy. Its features include a floating desktop widget for easy access, voice-activated commands, custom hotkeys, transcription history, and more.
    Downloads: 16 This Week
  • 17
    carbon-components-svelte

    Svelte implementation of the Carbon Design System

    ...The library also sits within a broader Carbon Svelte ecosystem that includes icon components, pictograms, charts, and preprocessors, which makes it possible to assemble full product interfaces with a unified design vocabulary. Its styling model supports multiple official themes.
    Downloads: 6 This Week
  • 18
    Hera

    Hera is an Argo Python SDK

    ...Hera aims to make the construction and submission of various Argo Project resources easy and accessible to everyone! Hera abstracts away low-level setup details while still maintaining a consistent vocabulary with Argo.
    Downloads: 0 This Week
  • 19
    WeClone

    One-stop solution for creating your digital avatar from chat history

    ...It is intended primarily as an experimental exploration of digital personality modeling and conversational AI personalization. By processing large volumes of conversation data, WeClone can build a profile of an individual’s writing tone, vocabulary preferences, and conversational tendencies. Developers can use the resulting model to create chatbots that simulate a specific user’s communication patterns for testing or research purposes. Overall, WeClone explores the idea of digital identity replication through machine learning and conversational modeling.
    Downloads: 4 This Week
  • 20
    gTTS

    Python library and CLI tool to interface with Google Translate

    ...The library is designed to handle long texts, using a speech-specific sentence tokenizer that keeps intonation and punctuation natural while splitting requests into acceptable chunks. It supports customizable text pre-processors, which can correct pronunciations, tweak formatting, or handle domain-specific vocabulary before sending it to the API. gTTS is primarily aimed at developers who want a quick way to add cloud-backed speech to scripts, apps, or pipelines without managing any model weights locally. A small CLI utility, gtts-cli, makes it easy to test or batch-generate MP3 files right from the shell.
    Downloads: 5 This Week
  • 21
    API Platform Core

    The server component of API Platform, hypermedia and GraphQL APIs

    ...It is a component of the API Platform framework and it can be integrated with the Symfony framework using the bundle distributed with the library. It natively supports popular open formats including JSON for Linked Data (JSON-LD), Hydra Core Vocabulary, OpenAPI v2 (formerly Swagger) and v3, HAL and Problem Details. Build a working and fully-featured CRUD API in minutes. Leverage the awesome features of the tool to develop complex and high-performance API-first projects. Extend or override everything you want.
    Downloads: 1 This Week
  • 22
    Marksheet

    Free tutorial to learn HTML and CSS

    ...It explains core building blocks—elements, attributes, selectors, the box model, positioning—and connects them to the mental models needed for real layouts. The writing style aims to demystify jargon and teach a consistent vocabulary so learners can understand documentation and tutorials elsewhere. It includes diagrams and compact examples that illustrate concepts without burying readers in boilerplate. The material emphasizes progressive mastery, encouraging learners to build small pages and refine them with better structure and style. It’s a useful reference to revisit after your first projects, reinforcing fundamentals that make larger frameworks easier to learn later.
    Downloads: 0 This Week
  • 23
    Front-End Design Checklist

    The Design Checklist for Creative Web Designers

    ...The resource includes checks for responsive breakpoints, interaction states, accessibility considerations, and asset preparation, reducing rework later in the build. It promotes shared vocabulary and artifacts, helping teams avoid ambiguities around components, states, and edge cases. By using it early in the process, teams can prevent visual drift, inconsistent spacing, and incomplete specifications. The result is a repeatable, predictable path from mockup to production-quality UI.
    Downloads: 0 This Week
  • 24
    A Fluent Builder For Schema.org Types

    A fluent builder for Schema.org types and ld+json generator

    spatie/schema-org provides a fluent builder for all Schema.org types and their properties. The code in src is generated from Schema.org's JSON-LD standards file, so it provides objects and methods for the entire core vocabulary. The classes and methods are also fully documented as a quick reference. We highly appreciate you sending us a postcard from your hometown, mentioning which of our package(s) you are using. You'll find our address on our contact page. We publish all received postcards on our virtual postcard wall. If you don't want to break the chain of a large schema object, you can use the if method to conditionally modify the schema. ...
    Downloads: 1 This Week
  • 25
    llms.txt hub

    The largest directory for AI-ready documentation and tools

    llms-txt-hub serves as a central directory and knowledge base for the emerging llms.txt convention, a simple, text-based way for project owners to communicate preferences to AI tools. It catalogs implementations across projects and platforms, helping maintain a shared understanding of how LLM-powered services should interact with code and documentation. The repository aims to standardize patterns for allowlists, denylists, attribution, rate expectations, and contact information, mirroring...
    Downloads: 0 This Week