Page 6 | git:/git.code.sf.net/p/docfetcher/code free download

gpt-oss-120b

OpenAI’s open-weight 120B model optimized for reasoning and tooling

...Developers can control the reasoning level (low, medium, high) to balance speed and depth depending on the task. Released under the Apache 2.0 license, it enables both commercial and research applications. The model supports function calling, web browsing, and code execution, streamlining intelligent agent development.

Downloads: 0 This Week

Last Update: 2025-08-05


DeepSeek-V3.1-Terminus

685B model with improved agents and consistency

...It improves language consistency, reducing mixed Chinese-English outputs and eliminating abnormal characters, enhancing reliability in multilingual scenarios. The update also refines agentic capabilities, especially for the Code Agent and Search Agent, leading to better tool integration and query handling. Benchmarks show small but notable gains, such as raising MMLU-Pro from 84.8 to 85.0, GPQA-Diamond from 80.1 to 80.7, and SWE Verified from 66.0 to 68.4, along with significant improvements in agent benchmarks like BrowseComp (30.0 → 38.5) and Terminal-bench (31.3 → 36.7). ...
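As a rough check on the figures quoted above, the following sketch computes the absolute and relative gain for each benchmark. The scores are copied from the description; the `gains` helper is a hypothetical name introduced here for illustration only.

```python
# Before/after scores as quoted in the description above.
scores = {
    "MMLU-Pro": (84.8, 85.0),
    "GPQA-Diamond": (80.1, 80.7),
    "SWE Verified": (66.0, 68.4),
    "BrowseComp": (30.0, 38.5),
    "Terminal-bench": (31.3, 36.7),
}


def gains(before, after):
    """Return (absolute gain in points, relative gain in percent), rounded to 1 decimal."""
    absolute = round(after - before, 1)
    relative = round(100.0 * (after - before) / before, 1)
    return absolute, relative


for name, (before, after) in scores.items():
    absolute, relative = gains(before, after)
    print(f"{name}: +{absolute} points ({relative}% relative)")
```

Seen this way, the knowledge benchmarks move by less than one point, while BrowseComp gains 8.5 points, roughly 28% relative to its earlier score.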

Downloads: 0 This Week

Last Update: 2025-09-24


BLEURT-20-D12

Custom BLEURT model for evaluating text similarity using PyTorch

...Unlike the standard TensorFlow-based BLEURT models, this version is built on a custom PyTorch transformer library. It requires installing the model-specific library from GitHub to function properly. Once set up, it can compute similarity scores with minimal code. BLEURT-20-D12 enables more flexible deployment in PyTorch-based workflows for evaluating language generation outputs.

Downloads: 0 This Week

Last Update: 2025-07-02


Hermes 4

Hermes 4 FP8: hybrid reasoning Llama-3.1-405B model by Nous Research

...It introduces a hybrid reasoning mode with explicit <think> segments, enabling the model to deliberate deeply when needed and switch to faster responses when desired. Post-training improvements include a vastly expanded corpus with ~60B tokens, boosting performance across math, code, STEM, logic, creativity, and structured outputs. The model is designed for schema adherence, producing valid JSON and repairing malformed outputs, making it highly suitable for tool use and function calling. Hermes 4 is engineered for superior steerability with reduced refusal rates, aligning responses to user values while preserving assistant quality. ...
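A minimal sketch of how a client might separate the deliberation from the final reply when the hybrid reasoning mode is active. It assumes each reasoning segment is wrapped in a `<think>` ... `</think>` pair; the closing tag and the `split_reasoning` helper are assumptions made for illustration and are not taken from the description.

```python
import re

# Assumption: reasoning is delimited by <think> and </think>. Only the opening
# tag is mentioned in the description; the closing tag is assumed here.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (list of reasoning segments, text with those segments removed)."""
    segments = [segment.strip() for segment in THINK_RE.findall(text)]
    answer = THINK_RE.sub("", text).strip()
    return segments, answer


sample = "<think>2 + 2 is 4.</think>The answer is 4."
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

Text without any reasoning segment passes through unchanged, so the same helper can handle both the deliberate and the fast response modes.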

Downloads: 0 This Week

Last Update: 2025-09-01


wav2vec2-large-xlsr-53-portuguese

Portuguese ASR model fine-tuned from XLSR-53 for 16 kHz audio input

wav2vec2-large-xlsr-53-portuguese is an automatic speech recognition (ASR) model fine-tuned on Portuguese using the Common Voice 6.1 dataset. It is based on Facebook’s wav2vec2-large-xlsr-53, a multilingual self-supervised learning model, and is optimized to transcribe Portuguese speech sampled at 16kHz. The model performs well without a language model, though adding one can improve word error rate (WER) and character error rate (CER). It achieves a WER of 11.3% (or 9.01% with LM) on Common...
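The two metrics named above can be sketched as follows: WER is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words, and CER is the same ratio over characters. This is a plain-Python illustration with hypothetical helper names, not the evaluation script used for the reported figures, and it applies no text normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_item in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# One dropped word out of five reference words.
print(wer("o gato dorme no sofá", "o gato dorme sofá"))
```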

Downloads: 0 This Week

Last Update: 2025-07-01


GigaChat 3 Ultra

High-performance MoE model with MLA, MTP, and multilingual reasoning

...The model also employs Multi-Token Prediction, enabling multi-step token generation in a single pass for up to 40% faster output through speculative and parallel decoding techniques. Its training corpus incorporates ten languages, enriched with books, academic sources, code datasets, mathematical tasks, and more than 5.5 trillion tokens of high-quality synthetic data. This combination significantly boosts reasoning, coding, and multilingual performance across modern benchmarks. Designed for high-performance deployment, GigaChat 3 Ultra supports major inference engines and offers optimized BF16 and FP8 execution paths for cluster-grade hardware.

Downloads: 0 This Week

Last Update: 2025-12-03


OpenVLA 7B

Vision-language-action model for robot control via images and text

OpenVLA 7B is a multimodal vision-language-action model trained on 970,000 robot manipulation episodes from the Open X-Embodiment dataset. It takes camera images and natural language instructions as input and outputs normalized 7-DoF robot actions, enabling control of multiple robot types across various domains. Built on top of LLaMA-2 and DINOv2/SigLIP visual backbones, it allows both zero-shot inference for known robot setups and parameter-efficient fine-tuning for new domains. The model...
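Because the model outputs normalized actions, a controller has to map them back to robot units before execution. The sketch below assumes a simple linear mapping from the range [-1, 1] to per-dimension bounds; the range, the bounds, and the `unnormalize` helper are hypothetical and shown only to illustrate the idea, since the actual statistics depend on the robot setup and are not given in the description.

```python
def unnormalize(action, low, high):
    """Map a normalized 7-DoF action from [-1, 1] to the interval [low, high] per dimension."""
    if not (len(action) == len(low) == len(high) == 7):
        raise ValueError("expected 7 values for action, low, and high")
    return [lo + 0.5 * (a + 1.0) * (hi - lo) for a, lo, hi in zip(action, low, high)]


# Hypothetical bounds: six end-effector deltas plus one gripper value.
low = [-0.1] * 6 + [0.0]
high = [0.1] * 6 + [1.0]

# A normalized action of all zeros lands at the midpoint of each interval.
print(unnormalize([0.0] * 7, low, high))
```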

Downloads: 0 This Week

Last Update: 2025-07-23


Llama-3.2-1B-Instruct

Instruction-tuned 1.2B LLM for multilingual text generation by Meta

Llama-3.2-1B-Instruct is Meta’s multilingual, instruction-tuned large language model with 1.24 billion parameters, optimized for dialogue, summarization, and retrieval tasks. It builds upon the Llama 3.1 architecture and incorporates fine-tuning techniques like SFT, DPO, and quantization-aware training for improved alignment, efficiency, and safety. The model supports eight primary languages (including English, Spanish, Hindi, and Thai) and was trained on a curated mix of publicly available...

Downloads: 0 This Week

Last Update: 2025-07-02


Search Results for "git:/git.code.sf.net/p/docfetcher/code" - Page 6

Showing 133 open source projects for "git:/git.code.sf.net/p/docfetcher/code"

