Search Results for "git:/git.code.sf.net/p/docfetcher/code" - Page 2

Showing 40 open source projects for "git:/git.code.sf.net/p/docfetcher/code"


1

ruia

Async Python framework for fast and flexible web scraping spiders

...Ruia is powered by Python’s asyncio library along with aiohttp, enabling developers to perform concurrent network requests efficiently and scrape data from websites with minimal overhead. Ruia follows a “write less, run faster” philosophy, emphasizing concise code and streamlined spider development. It provides a structured approach to building scraping projects through components such as data items, spiders, middleware, and plugins. Developers can define structured fields to extract information from HTML content and process responses asynchronously to improve crawling performance. It also supports middleware and plugin systems that allow customization of request handling, response processing, and additional functionality.

Downloads: 1 This Week

Last Update: 2026-03-11
See Project
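
The concurrent-request idea described for ruia can be illustrated with the standard library alone. This is a minimal sketch and not Ruia's own API: `fetch` and `crawl` are hypothetical names, and `fetch` only simulates a network call, where a real spider would use an HTTP client such as aiohttp.

```python
import asyncio


async def fetch(url: str) -> str:
    """Hypothetical stand-in for an HTTP request; no network access happens."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"<html><title>{url}</title></html>"


async def crawl(urls):
    # gather() runs all fetches concurrently and keeps results in input order
    return await asyncio.gather(*(fetch(u) for u in urls))


pages = asyncio.run(crawl(["https://example.com/a", "https://example.com/b"]))
```

Because the coroutines wait concurrently, the total time stays close to the slowest single request rather than the sum of all requests.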
2

MusicalGalaxy

Shows the complex connection between musicians and their pupils

...The brightest star at the moment is Nadia Boulanger, who single-handedly changed the face of modern music, teaching the likes of Philip Glass, Daniel Barenboim and Aaron Copland. Credit to OrigamiDrag0n, 2020. All code is released under the MIT license. UPDATE - clicking the bubbles now opens the webpage of the chosen composer. Error messages thrown by this update are currently being investigated.

Downloads: 0 This Week

Last Update: 2020-07-15
See Project
3

Requests-HTML

Pythonic HTML Parsing for Humans

...Automatic following of redirects. Connection pooling and cookie persistence. The Requests experience you know and love, with magical parsing abilities and async support. The rest of the code operates the same way as the synchronous version, except that results is a list containing multiple response objects; the same basic processes can still be applied to extract the data you want.

Downloads: 0 This Week

Last Update: 2023-04-10
See Project
4

GitGet

Ever wanted to download only a part of a Git repository?

Ever wanted to download only a part of a Git repository? Just paste the URL of the repo you want to download, then sit back and enjoy. This simple Java application uses web scraping to download only the files you need, saving you precious bandwidth and space.

1 Review

Downloads: 0 This Week

Last Update: 2018-09-03
See Project
5

jd-autobuy

Python tool that automates JD.com login and product purchase tasks

...It uses web scraping and HTTP request techniques to log into an account, check product availability, and attempt to purchase specified items automatically. It supports login through methods such as QR code authentication, allowing users to sign in through the platform’s mobile application. Once authenticated, the script can retrieve product details including price, stock status, and item information. It can automatically add items to the shopping cart and prepare an order submission workflow for faster purchasing during high-demand sales or limited stock releases. ...

Downloads: 1 This Week

Last Update: 4 days ago
See Project
6

Gecco

Lightweight Java web crawler framework with jQuery-style extraction

...It integrates several well-known Java libraries and frameworks, including tools for HTTP requests, HTML parsing, JSON processing, and application development. Through its annotation-based design, developers can define crawling rules and data extraction logic directly within Java classes, reducing boilerplate code and improving readability. Gecco also provides mechanisms for handling dynamic web content, including support for asynchronous requests and extraction of JavaScript variables from pages. Gecco emphasizes extensibility and follows an open design that allows additional components and integrations to be added without modifying the core codebase.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
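
To illustrate what extracting a JavaScript variable from a page can involve, here is a minimal sketch in Python. It is not Gecco's API (Gecco is a Java framework): `extract_js_variable` is a hypothetical helper, and it assumes the variable is declared with `var` and assigned a JSON-compatible literal.

```python
import json
import re


def extract_js_variable(html, name):
    """Return the JSON-compatible value assigned to `var <name> = ...;`, or None."""
    pattern = re.compile(
        r"\bvar\s+" + re.escape(name) + r"\s*=\s*(.+?);\s*(?:\n|</script>)",
        re.S,
    )
    match = pattern.search(html)
    if match is None:
        return None
    # Only works when the assigned literal is also valid JSON
    return json.loads(match.group(1))


html = '<script>var product = {"id": 42, "price": 9.5};</script>'
product = extract_js_variable(html, "product")
```

A regular expression is enough for simple inline data like this; pages that build their variables at runtime would need a real JavaScript engine instead.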
7

Simple-Scrape

Simple-Scrape is a simple web-scraping library that allows for programmatic access to HTML code. No further techniques are needed and the library is very compact and thus easy to use.

Downloads: 0 This Week

Last Update: 2017-04-28
See Project
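
As a rough idea of what programmatic access to HTML looks like, the sketch below collects link targets with Python's standard `html.parser` module. It does not use Simple-Scrape itself, and `LinkCollector` is a hypothetical class written for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(value)


collector = LinkCollector()
collector.feed('<p><a href="/docs">Docs</a> and <a href="/faq">FAQ</a></p>')
links = collector.links
```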
8

RoboBrowser

On the fly web scraper

RoboBrowser is a WebKit-powered browser built for web scraping. It loads the requested webpage, saves the page source to disk, and passes its path to a PHP script as the first parameter.

Downloads: 0 This Week

Last Update: 2016-09-18
See Project
9

webStraktor

...The webStraktor scripting language has a small instruction set and its syntax is easy to master. The standard webStraktor output format is XML based, either in ASCII, UTF-8 or ISO-8859-1 (Latin1) code pages. webStraktor relies on the Apache HttpClient for retrieving content via the HTTP protocol. It adheres to the Robots Exclusion Protocol and it can be configured to operate in an anonymous way by connecting to the predominant types of web proxy servers. webStraktor extends the functionality of web crawlers, spiders or bots by integrating scraping and crawling capabilities.

Downloads: 0 This Week

Last Update: 2014-04-25
See Project
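
Adhering to the Robots Exclusion Protocol means checking a site's robots.txt rules before fetching a URL. The sketch below shows that check with Python's standard `urllib.robotparser`; it is not webStraktor's implementation, and the rules, the `example-bot` user agent and the URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Example rules, parsed from a string so that no network access is needed
rules = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A polite crawler asks before each request
allowed = parser.can_fetch("example-bot", "https://example.com/index.html")
blocked = parser.can_fetch("example-bot", "https://example.com/private/data.html")
```

Here `allowed` is true and `blocked` is false, so a crawler honoring the protocol would skip the second URL.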
10

Scra.php

Scrape anything!

The ultimate customisable YAML-ised web scraper for PHP

Downloads: 0 This Week

Last Update: 2014-01-20
See Project
11

Folksonomy Web Crawler

A Web crawler prototype designed to index pages of certain resource sharing platforms based on folksonomy tags. The results are displayed in an Excel spreadsheet.

Downloads: 0 This Week

Last Update: 2015-02-08
See Project
12

Webtools 4 larbin

Larbin is a Web crawler intended to fetch a large number of Web pages; it should be able to fetch more than 100 million pages on a standard PC with much u/d. This set of PHP and Perl scripts, called webtools4larbin, can handle the output of Larbin and p

Downloads: 0 This Week

Last Update: 2013-03-21
See Project
13

Arachnid Web Spider Framework

...It includes a simple HTML parser object that parses an input stream containing HTML content. Simple Web spiders can be created by sub-classing Arachnid and adding a few lines of code called after each page

Downloads: 1 This Week

Last Update: 2013-03-08
See Project
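
The sub-classing pattern described for Arachnid, where a few lines of user code are called after each page, can be sketched as follows. Arachnid itself is a Java framework; this Python version only mirrors the idea, and `Spider`, `TitleSpider` and `handle_page` are hypothetical names. Pages come from a dictionary instead of real HTTP fetches.

```python
class Spider:
    """Hypothetical base class: walks a fixed set of pages and calls a hook."""

    def __init__(self, pages):
        self.pages = pages  # mapping of URL -> HTML, standing in for real fetches

    def run(self):
        for url, html in self.pages.items():
            self.handle_page(url, html)

    def handle_page(self, url, html):
        """Hook called after each page; subclasses override it."""


class TitleSpider(Spider):
    """Example subclass that records the <title> of every visited page."""

    def __init__(self, pages):
        super().__init__(pages)
        self.titles = {}

    def handle_page(self, url, html):
        start = html.find("<title>")
        end = html.find("</title>")
        if start != -1 and end != -1:
            self.titles[url] = html[start + len("<title>"):end]


spider = TitleSpider({"https://example.com/": "<html><title>Home</title></html>"})
spider.run()
```

The base class owns the traversal loop, so a new spider only has to supply the per-page logic.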
14

URL Web Crawler

It is basically a program that lets you build a search engine. It includes a web crawler, all the web site source code (in ASP, soon to be PHP as well), and a MySQL database.

Downloads: 2 This Week

Last Update: 2015-05-23
See Project
15

Blackfire Player

Web Crawling, Web Testing, and Web Scraping application

...Some Blackfire Player use cases: Crawl a website/API and check expectations -- aka Acceptance Tests; Scrape a website/API and extract values; Monitor a website; Test code with unit test integration (PHPUnit, Behat, Codeception, ...); Test code behavior from the outside thanks to the native Blackfire Profiler integration -- aka Unit Tests from the HTTP layer (tm). Blackfire Player executes scenarios written in a special DSL (files should end with .bkf).

Downloads: 0 This Week

Last Update: 2019-06-11
See Project



© 2026 Slashdot Media. All Rights Reserved.
