Search Results for "web crawler" - Page 3

Showing 221 open source projects for "web crawler"

1

crawlergo

Headless Chrome crawler for collecting URLs for vulnerability scans

crawlergo is a browser-based web crawler designed to collect URLs and request data that can be used by web vulnerability scanning tools. It uses a Chrome headless environment to render web pages and observe behavior during the DOM rendering stage in order to capture as many accessible endpoints as possible. By monitoring the page lifecycle and interacting with web elements, the crawler automatically triggers JavaScript events and navigational actions that would normally occur during real user interaction. ...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
2

grab-site

Web crawler for archiving and backing up sites into WARC archives

grab-site is an open source web crawling tool designed to archive and back up websites by recursively downloading their content. It works by taking a starting URL and systematically following links across the site, capturing pages and resources and saving them into WARC archive files for long-term preservation. Internally, the crawler uses a fork of the wpull engine to fetch and process web pages efficiently during large-scale crawls. grab-site includes a built-in dashboard that displays real-time crawl activity, including which URLs are currently being processed and how many remain in the queue. ...

Downloads: 0 This Week

Last Update: 2 days ago
See Project
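
The recursive link-following described for grab-site can be illustrated with a minimal sketch. This is not grab-site's or wpull's actual code: the `crawl` function, the `LinkCollector` class, and the in-memory `SITE` dictionary are all hypothetical, the fetcher is a stub so the example runs without network access, and WARC output is left out entirely.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Breadth-first crawl limited to the start URL's host.

    `fetch` is any callable returning HTML text, or None if the page
    is unavailable. Returns the URLs in the order they were visited.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    pending = deque([start_url])
    visited = []
    while pending:
        url = pending.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links and drop #fragments before comparing.
            target = urldefrag(urljoin(url, href)).url
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                pending.append(target)
    return visited


# Hypothetical four-page site used in place of real HTTP requests.
SITE = {
    "http://example.test/": '<a href="/about">About</a> <a href="/blog/">Blog</a> '
                            '<a href="http://other.test/">External</a>',
    "http://example.test/about": '<a href="/">Home</a>',
    "http://example.test/blog/": '<a href="post1#top">Post</a>',
    "http://example.test/blog/post1": "no links here",
}

if __name__ == "__main__":
    for page in crawl("http://example.test/", SITE.get):
        print(page)
```

The `seen` set is what keeps the crawl from looping when pages link back to each other, and the host check keeps it from wandering off the site being archived.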
3

DecryptLogin

Python library providing APIs for automated website login workflows

DecryptLogin is a Python library designed to simplify automated login processes for many popular websites by providing ready-to-use APIs that simulate authentication behavior. It focuses on implementing login mechanisms through HTTP requests, allowing developers to programmatically authenticate with supported services without manually replicating complex login flows. It includes modules that handle different authentication modes such as PC login, mobile login, and QR code login depending on...

Downloads: 0 This Week

Last Update: 2 days ago
See Project
4

Hakrawler

Fast Go web crawler for discovering URLs and web app endpoints

hakrawler is a lightweight command-line web crawler built in Go that is designed to quickly discover URLs, endpoints, and assets within web applications. It is primarily used during the reconnaissance phase of security testing, bug bounty hunting, and penetration testing. It works by automatically crawling web pages and extracting links, JavaScript file locations, and other resources that may reveal additional attack surface or hidden functionality. hakrawler is implemented as a simple and efficient crawler using the Gocolly library, which allows it to perform fast and concurrent crawling of web pages. ...

Downloads: 0 This Week

Last Update: 2026-03-06
See Project
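
As a rough illustration of the kind of extraction described for hakrawler, the sketch below pulls link targets and JavaScript file locations out of one HTML page using only the Python standard library. It is not hakrawler's code (which is written in Go on top of Gocolly); the `EndpointExtractor` class, the `extract_endpoints` function, and the choice to also collect form actions are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class EndpointExtractor(HTMLParser):
    """Gathers URLs from a few tags that commonly reveal endpoints."""

    # tag name -> attribute that carries the URL (form/action is an assumption)
    URL_ATTRS = {"a": "href", "script": "src", "form": "action"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.found = {tag: set() for tag in self.URL_ATTRS}

    def handle_starttag(self, tag, attrs):
        attr_name = self.URL_ATTRS.get(tag)
        if attr_name is None:
            return
        value = dict(attrs).get(attr_name)
        if value:
            # Resolve relative paths against the page URL.
            self.found[tag].add(urljoin(self.base_url, value))


def extract_endpoints(base_url, html):
    """Return a dict mapping tag name to a sorted list of absolute URLs."""
    extractor = EndpointExtractor(base_url)
    extractor.feed(html)
    return {tag: sorted(urls) for tag, urls in extractor.found.items()}


if __name__ == "__main__":
    sample = (
        '<a href="/login">Log in</a>'
        '<script src="/static/app.js"></script>'
        '<form action="/api/search"><input name="q"></form>'
    )
    print(extract_endpoints("http://example.test/", sample))
```

Storing results in sets removes duplicates, which matters on real pages where the same link often appears many times.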
5

pspider

Simple Python framework for building multithreaded web crawlers

PSpider is a lightweight web crawling framework written in Python designed to simplify the development of custom web spiders. It focuses on providing an easy-to-understand architecture while still supporting concurrent crawling for improved performance. It uses a multithreaded model that separates the crawling workflow into several components responsible for fetching, parsing, and saving data.

Downloads: 1 This Week

Last Update: 1 day ago
See Project
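
The fetch/parse/save separation described for PSpider can be sketched with standard-library threads and queues. This is not PSpider's API: the function names, the `FAKE_PAGES` data, and the use of `None` as a shutdown signal are all assumptions for illustration, and the fetch stage is stubbed so the example runs offline.

```python
import queue
import threading

# Hypothetical in-memory pages standing in for real HTTP responses.
FAKE_PAGES = {
    "http://example.test/": "<html>home</html>",
    "http://example.test/a": "<html>page a</html>",
    "http://example.test/b": "<html>page b, a bit longer</html>",
}


def fetch(url):
    """Fetch stage: return raw content for a URL (stubbed here)."""
    return FAKE_PAGES.get(url, "")


def parse(url, content):
    """Parse stage: turn raw content into a record worth saving."""
    return {"url": url, "length": len(content)}


def run_pipeline(urls, num_fetchers=2):
    """Run fetch -> parse -> save, each stage on its own thread(s)."""
    fetch_q, parse_q, save_q = queue.Queue(), queue.Queue(), queue.Queue()
    saved = []

    def fetcher():
        while (url := fetch_q.get()) is not None:
            parse_q.put((url, fetch(url)))

    def parser():
        while (item := parse_q.get()) is not None:
            save_q.put(parse(*item))

    def saver():
        while (record := save_q.get()) is not None:
            saved.append(record)

    fetchers = [threading.Thread(target=fetcher) for _ in range(num_fetchers)]
    parser_thread = threading.Thread(target=parser)
    saver_thread = threading.Thread(target=saver)
    for thread in fetchers + [parser_thread, saver_thread]:
        thread.start()

    for url in urls:
        fetch_q.put(url)
    # Shut the stages down in order: one None per fetcher, then downstream.
    for _ in fetchers:
        fetch_q.put(None)
    for thread in fetchers:
        thread.join()
    parse_q.put(None)
    parser_thread.join()
    save_q.put(None)
    saver_thread.join()
    return sorted(saved, key=lambda record: record["url"])


if __name__ == "__main__":
    for record in run_pipeline(list(FAKE_PAGES)):
        print(record)
```

Several fetchers run in parallel because fetching is usually the slow, I/O-bound stage, while a single parser and a single saver keep the later stages simple.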
6

instagram-profilecrawl

Instagram profile crawler that extracts posts, tags, and stats

...It also provides scripts for downloading images from crawled profiles and logging statistics into CSV files for tracking metrics like followers, likes, and comments. Authentication is optional, meaning the crawler can access public profile data without logging in.

Downloads: 3 This Week

Last Update: 1 day ago
See Project
7

appcrawler

Automated mobile app crawler and testing tool built on Appium

AppCrawler is an automated mobile application testing tool designed to explore and interact with app user interfaces automatically. Built on top of the Appium automation framework, it systematically crawls through application screens and performs actions such as clicking buttons, navigating menus, and interacting with UI elements to simulate user behavior. It is commonly used for automated functional testing, UI exploration, and detecting crashes or unexpected behaviors in mobile...

Downloads: 1 This Week

Last Update: 2026-03-11
See Project
8

Abdal Web Traffic Generator

create useful statistics and traffic on your site

This tool generates traffic and usage statistics for your site, which can help improve its ranking on services such as Alexa.

1 Review

Downloads: 1 This Week

Last Update: 2021-12-05
See Project
9

ReconSpider

Most Advanced Open Source Intelligence (OSINT) Framework

...Reconnaissance is a mission to obtain information by various detection methods, about the activities and resources of an enemy or potential enemy, or geographic characteristics of a particular area. A Web crawler, sometimes called a spider or spiderbot and often shortened to crawler, is an Internet bot that systematically browses the World Wide Web, typically for the purpose of Web indexing (web spidering).

Downloads: 8 This Week

Last Update: 2022-11-25
See Project
10

Abot

Fast and flexible C# framework for building customizable web crawlers

Abot is an open source C# web crawler framework designed to help developers efficiently crawl and process web content. It focuses on speed, flexibility, and extensibility while handling the complex low-level tasks involved in web crawling. It manages essential components such as multithreading, HTTP requests, scheduling, and link parsing so developers can focus on processing the collected data.

Downloads: 0 This Week

Last Update: 2 days ago
See Project
11

gocrawl

Polite concurrent web crawler library for Go with flexible hooks

gocrawl is a lightweight web crawling library written in the Go programming language that enables developers to build custom web crawlers and data extraction tools. gocrawl focuses on providing a minimal yet powerful crawling engine that can be easily extended and adapted for different web scraping or indexing tasks. It is designed to be polite when accessing websites by respecting crawling rules such as robots.txt policies and applying crawl delays for each host. It executes requests...

Downloads: 1 This Week

Last Update: 2026-03-11
See Project
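
The politeness behavior described for gocrawl (honoring robots.txt and spacing out requests per host) can be sketched in Python with the standard library's `urllib.robotparser`. This is not gocrawl's Go API: the `PoliteGate` class, its method names, and the one-second default delay are assumptions for illustration, and robots.txt content is passed in as text so the example needs no network access.

```python
import time
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


class PoliteGate:
    """Tracks robots.txt rules and the next allowed request time per host."""

    def __init__(self, user_agent, default_delay=1.0, clock=time.monotonic):
        self.user_agent = user_agent
        self.default_delay = default_delay  # assumed fallback, in seconds
        self.clock = clock                  # injectable so tests need not sleep
        self.rules = {}                     # host -> RobotFileParser
        self.next_ok = {}                   # host -> earliest next request time

    def load_robots(self, host, robots_txt):
        """Parse robots.txt text that the caller has already downloaded."""
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        self.rules[host] = parser

    def allowed(self, url):
        """True if robots.txt permits this URL (or no rules are loaded)."""
        parser = self.rules.get(urlparse(url).netloc)
        return True if parser is None else parser.can_fetch(self.user_agent, url)

    def wait_time(self, url):
        """Seconds the caller should still wait before requesting this URL."""
        host = urlparse(url).netloc
        return max(0.0, self.next_ok.get(host, 0.0) - self.clock())

    def record_request(self, url):
        """Note that a request was just made, scheduling the host's next slot."""
        host = urlparse(url).netloc
        parser = self.rules.get(host)
        delay = parser.crawl_delay(self.user_agent) if parser else None
        if delay is None:
            delay = self.default_delay
        self.next_ok[host] = self.clock() + delay


if __name__ == "__main__":
    gate = PoliteGate("demo-bot")
    gate.load_robots("example.test", "User-agent: *\nDisallow: /private/\nCrawl-delay: 2\n")
    print(gate.allowed("http://example.test/private/report"))
    print(gate.allowed("http://example.test/public/page"))
```

A crawler loop would check `allowed`, sleep for `wait_time`, make the request, then call `record_request`. Keeping the delay per host means a slow site does not hold up requests to other hosts.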
12

lxspider

Educational Python web scraping case collection for many sites

lxSpider is a collection of web scraping examples designed primarily for learning and experimentation with data extraction techniques. It gathers numerous crawler implementations that demonstrate how to collect data from a wide range of websites and online services. It focuses heavily on practical cases that illustrate how different platforms handle requests, authentication parameters, and anti-scraping protections. lxSpider includes examples targeting areas such as e-commerce platforms, social media services, content sites, research databases, and information portals. ...

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
13

CEF Python

Python bindings for the Chromium Embedded Framework (CEF)

CEF Python is an open source project founded by Czarek Tomczak in 2012 to provide Python bindings for the Chromium Embedded Framework (CEF). The Chromium project focuses mainly on Google Chrome application development, while CEF focuses on facilitating embedded browser use cases in third-party applications. Many applications use the CEF control, and there are more than 100 million CEF instances installed around the world. There are...

Downloads: 3 This Week

Last Update: 2022-05-03
See Project
14

Mowglee

Mowglee - The Geo Crawler!

Mowglee is a distributed, multi-threaded web crawler in Java, built on asynchronous task execution. It is designed for geographic affinity and is highly modular.

Downloads: 0 This Week

Last Update: 2021-02-18
See Project
15

NY Times

A Simple Demonstration of the New York Times App

NY Times is a minimal news 🗞 Android application built to demonstrate the use of JSoup with modern Android development tools.

Downloads: 0 This Week

Last Update: 2024-02-22
See Project
16

proxypool

Proxy crawler that aggregates, tests, and serves usable proxy nodes

...The behavior of the crawler and the sources it scans can be configured through configuration files, enabling users to customize how nodes are gathered and maintained. It also supports scheduled crawling to continuously update the proxy list and keep the pool current with newly discovered nodes.

Downloads: 17 This Week

Last Update: 2026-03-10
See Project
17

RED HAWK

All-in-one reconnaissance and vulnerability scanning toolkit for sites

RED HAWK is an open source command-line security tool designed for information gathering, vulnerability scanning, and web reconnaissance tasks. It combines multiple scanning and analysis capabilities into a single toolkit to help security researchers and penetration testers quickly analyze a target website. It can collect a wide range of information about domains, servers, and web applications, including network details, hosting configuration, and content management system detection. It also...

Downloads: 2 This Week

Last Update: 2 days ago
See Project
18

PHP mini vulnerability suite

Multiple server/webapp vulnerability scanner

GitHub: https://github.com/samedog/phpmvs

Downloads: 0 This Week

Last Update: 2020-10-07
See Project
19

magnetW

Magnet link aggregation search

magnetW is based on the rule principle of magnetX: the search results of each magnet site are uniformly formatted. This project has no group chat; GitHub is used only for code hosting and related technical discussion, and any other addresses may be risky, so please judge them carefully. This project is open source and free. It has no collection channels of any kind, such as donations, and no advertising of any kind. If you encounter anything similar to the above situation, please don't believe...

Downloads: 1 This Week

Last Update: 2021-05-31
See Project
20

WebSploit Framework

WebSploit is a high level MITM Framework

WebSploit Advanced MITM Framework
[+] Autopwn - Used From Metasploit For Scan and Exploit Target Service
[+] wmap - Scan, Crawler Target Used From Metasploit wmap plugin
[+] format infector - inject reverse & bind payload into file format
[+] phpmyadmin Scanner
[+] CloudFlare resolver
[+] LFI Bypasser
[+] Apache Users Scanner
[+] Dir Bruter
[+] admin finder
[+] MLITM Attack - Man Left In The Middle, XSS Phishing Attacks
[+] MITM - Man In The Middle Attack
[+] Java Applet Attack
[+] MFOD Attack Vector
[+] ARP Dos Attack
[+] Web Killer Attack
[+] Fake Update Attack
[+] Fake Access point Attack
[+] Wifi Honeypot
[+] Wifi Jammer
[+] Wifi Dos
[+] Wifi Mass De-Authentication Attack
[+] Bluetooth POD Attack
Project on GitHub: https://github.com/websploit

Downloads: 9 This Week

Last Update: 2020-01-21
See Project
21

BotSlayer

BotSlayer Community Edition

BotSlayer is an application that helps track and detect potential manipulation of information spreading on Twitter. The tool is developed by the Observatory on Social Media at Indiana University, the same lab that brought you Botometer and Hoaxy. BotSlayer is not a tool to detect and remove likely social bots from your list of Twitter followers or friends. For that purpose, check out Botometer. If you just want to visualize the spread of some piece of information, consider Hoaxy....

Downloads: 0 This Week

Last Update: 2023-07-13
See Project
22

ECommerceCrawlers

Collection of Python ecommerce and website crawler example projects

ECommerceCrawlers is a collection of practical Python web crawler projects designed to gather data from a variety of ecommerce platforms, websites, and online services. It aggregates many independent crawler examples created by contributors and organized into separate subprojects that target specific sites or data sources. These examples demonstrate how to build and operate web scrapers capable of collecting structured information such as product listings, news content, job postings, social media data, and other publicly available web data. ...

Downloads: 9 This Week

Last Update: 1 day ago
See Project
23

X-RAY

The next web scraper, see through the <html> noise

Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing. The API is entirely composable, giving you great flexibility in how you scrape each page. Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't...

Downloads: 0 This Week

Last Update: 2021-10-05
See Project
24

Photon

Incredibly fast crawler designed for OSINT

Photon is an extremely fast web crawler built specifically for OSINT and reconnaissance use cases. It is designed to extract URLs, endpoints, files, and other intelligence artifacts from target websites with minimal overhead. The crawler prioritizes speed and breadth, making it suitable for mapping web attack surfaces and discovering hidden resources. Photon is commonly used during early reconnaissance phases to build a comprehensive inventory of reachable assets. ...

Downloads: 6 This Week

Last Update: 2026-03-03
See Project
25

ShadowSocksShare

Python ShadowSocks framework

This project crawls websites that share ss(r) accounts, parses the accounts and verifies their connectivity, then redistributes them and generates a subscription link. Almost all of the accounts crawled previously came from Google Plus, which was scheduled to close on April 2, 2019. If you are building your own website, please keep an eye on the updates of this project and redeploy using the latest source code.

Downloads: 0 This Week

Last Update: 2022-11-09
See Project

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Privacy Choices Advertise