Search Results for "/storage/emulated/0/android/data/net.sourceforge.uiq3.fx603p/files" - Page 5

Showing 134 open source projects for "/storage/emulated/0/android/data/net.sourceforge.uiq3.fx603p/files"

1

WeChatSogou

Python library to crawl and retrieve data from WeChat accounts

WechatSogou is an open source Python library designed to retrieve data from WeChat official accounts by using the Sogou WeChat search service as its data source. It provides developers with a programmatic way to search for public accounts and collect article information without manually browsing the search interface. It functions as a crawler interface that sends requests to the search engine, retrieves results, and converts the returned pages into structured data that can be used in applications or analysis pipelines. ...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
2

pyspider

A powerful Spider(Web Crawler) system in Python

...Or using MySQL or MongoDB and RabbitMQ to deploy a distributed crawl cluster. To deploy pyspider in a production environment, running each component in its own process and storing data in a database service is more reliable and flexible. To run pyspider components in separate processes, you need at least one database service. pyspider now supports MySQL, MongoDB and PostgreSQL. You can choose one of them.

Downloads: 0 This Week

Last Update: 2021-03-31
See Project
3

crawler4j

Open source web crawler for Java

...This class decides which URLs should be crawled and handles the downloaded page. The shouldVisit function decides whether the given URL should be crawled or not. In the above example, it rejects .css, .js and media files and only allows pages within the ics domain. The visit function is called after the content of a URL is downloaded successfully. You can easily get the URL, text, links, HTML, and unique id of the downloaded page. You should also implement a controller class which specifies the seeds of the crawl, the folder in which intermediate crawl data should be stored, and the number of concurrent threads.
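The filtering rule described above can be sketched in a few lines. This is a Python analogue for illustration only, not crawler4j's Java API: the function name `should_visit`, the extension list, and the host `ics.example.edu` are all assumptions standing in for whatever the project's own example uses.

```python
import re
from urllib.parse import urlparse

# Hypothetical values, chosen for illustration only.
SKIPPED_EXTENSIONS = re.compile(r".*\.(css|js|gif|jpe?g|png|mp3|mp4|zip|gz)$")
ALLOWED_DOMAIN = "ics.example.edu"


def should_visit(url: str) -> bool:
    """Return True if the URL looks like a page inside the allowed domain."""
    parsed = urlparse(url.lower())
    # Reject stylesheets, scripts and media files based on the path suffix.
    if SKIPPED_EXTENSIONS.match(parsed.path):
        return False
    # Accept the domain itself and any of its subdomains.
    host = parsed.hostname or ""
    return host == ALLOWED_DOMAIN or host.endswith("." + ALLOWED_DOMAIN)
```

A crawler loop would call `should_visit` on every discovered link before queueing it, so rejected URLs are never downloaded.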

Downloads: 0 This Week

Last Update: 2022-01-12
See Project
4

haipproxy

Distributed proxy IP pool for web crawlers using Scrapy and Redis

...It automatically crawls proxy resources from the internet and aggregates them into a centralized pool that can be accessed by distributed spiders and scraping systems. It is built using Python and relies on Scrapy for high-performance crawling while Redis is used for data storage, communication, and task coordination between components. It includes crawlers that discover proxy servers, validators that test proxy availability and performance, and schedulers that manage crawling and validation tasks. HAipproxy aims to maintain a high availability proxy pool with low latency so that scraping frameworks can rotate proxies efficiently and avoid blocking during large-scale data collection. ...
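The validate-then-rotate idea in the description can be sketched as follows. This is not haipproxy's code or API: it assumes an in-memory dictionary in place of Redis, and the class `ProxyPool`, the `probe` callback, and the `max_latency` threshold are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Callable, Dict, Iterator, Optional


@dataclass
class ProxyPool:
    """In-memory stand-in for a shared proxy pool (hypothetical, no Redis)."""

    max_latency: float = 1.0  # seconds; assumed threshold for a usable proxy
    _scores: Dict[str, float] = field(default_factory=dict, init=False)

    def validate(self, proxy: str, probe: Callable[[str], Optional[float]]) -> bool:
        """Keep the proxy only if the probe reports an acceptable latency."""
        latency = probe(proxy)  # None means the proxy did not respond
        if latency is None or latency > self.max_latency:
            self._scores.pop(proxy, None)
            return False
        self._scores[proxy] = latency
        return True

    def rotation(self) -> Iterator[str]:
        """Cycle through the currently valid proxies, fastest first."""
        return cycle(sorted(self._scores, key=self._scores.get))
```

In a real deployment the probe would issue a timed request through the proxy; here it is left as a callback so the sketch stays self-contained.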

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
5

gain

Asyncio-based Python framework for building fast web crawling spiders

...It is built on top of asynchronous technologies such as asyncio, aiohttp, and uvloop to support high-performance crawling with concurrent network requests. It provides a structured framework for creating spiders that can navigate websites, extract structured data, and process the collected results. Developers define crawlers using components such as spiders, parsers, and items, allowing them to organize crawling logic and data extraction rules clearly. Gain supports CSS selectors and XPath expressions for parsing page content and extracting specific elements. Gain also allows developers to configure headers, concurrency levels, and proxy settings to control how crawlers interact with target websites. ...
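The concurrency control mentioned above can be illustrated with plain asyncio. This sketch does not use Gain, aiohttp, or uvloop: `crawl`, `fake_fetch`, and the `concurrency` parameter are hypothetical names, and the fetch function is a stub that sleeps instead of making a network request.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def crawl(
    urls: List[str],
    fetch: Callable[[str], Awaitable[str]],
    concurrency: int = 5,
) -> Dict[str, str]:
    """Fetch all URLs concurrently, with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded_fetch(url: str) -> str:
        # The semaphore caps how many fetches run at the same time.
        async with semaphore:
            return await fetch(url)

    # gather preserves the input order, so results line up with urls.
    pages = await asyncio.gather(*(bounded_fetch(u) for u in urls))
    return dict(zip(urls, pages))


async def fake_fetch(url: str) -> str:
    """Stub standing in for a real HTTP request (assumption for the demo)."""
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"
```

Calling `asyncio.run(crawl(["a", "b", "c"], fake_fetch, concurrency=2))` returns a dictionary mapping each URL to its stubbed page; swapping `fake_fetch` for a real asynchronous HTTP call is the only change a working spider would need at this layer.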

Downloads: 1 This Week

Last Update: 4 days ago
See Project
6

Toapi

Convert websites into structured APIs automatically with Python tool

Toapi is a Python library designed to transform ordinary websites into usable API services. Instead of building a traditional web crawler that collects and stores data before exposing it through an API, Toapi simplifies the process by allowing developers to define data structures that automatically generate an API layer from existing web pages. It works by parsing HTML content from a source site and mapping selected elements into structured data that can be returned as JSON through API endpoints. ...
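The mapping from selected HTML elements to JSON can be sketched with the standard library alone. This is not Toapi's API: `FieldExtractor`, `page_to_json`, and the `(tag, class)` rule format are assumptions made for the example, and the matcher is deliberately simplistic (it does not handle nested elements of the same tag).

```python
import json
from html.parser import HTMLParser
from typing import Dict, Optional, Tuple


class FieldExtractor(HTMLParser):
    """Collect the text of the first element matching each (tag, class) rule."""

    def __init__(self, rules: Dict[str, Tuple[str, str]]) -> None:
        super().__init__()
        self.rules = rules
        self.record: Dict[str, str] = {}
        self._current: Optional[str] = None  # field currently being read

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name, (rule_tag, rule_class) in self.rules.items():
            if tag == rule_tag and rule_class in classes and name not in self.record:
                self._current = name
                self.record[name] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.record[self._current] += data

    def handle_endtag(self, tag):
        if self._current is not None and tag == self.rules[self._current][0]:
            self.record[self._current] = self.record[self._current].strip()
            self._current = None


def page_to_json(html: str, rules: Dict[str, Tuple[str, str]]) -> str:
    """Parse the page and return the extracted fields as a JSON object."""
    parser = FieldExtractor(rules)
    parser.feed(html)
    parser.close()
    return json.dumps(parser.record, sort_keys=True)
```

An API endpoint would then fetch the source page on request and return the output of `page_to_json` as its response body.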

Downloads: 1 This Week

Last Update: 3 days ago
See Project
7

Gecco

Lightweight Java web crawler framework with jQuery-style extraction

...It integrates several well-known Java libraries and frameworks, including tools for HTTP requests, HTML parsing, JSON processing, and application development. Through its annotation-based design, developers can define crawling rules and data extraction logic directly within Java classes, reducing boilerplate code and improving readability. Gecco also provides mechanisms for handling dynamic web content, including support for asynchronous requests and extraction of JavaScript variables from pages. Gecco emphasizes extensibility and follows an open design that allows additional components and integrations to be added without modifying the core codebase.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
8

Perl Web Scraping Project

Perl Web Scraping Project

Web scraping (web harvesting or web data extraction) is data scraping used for extracting data from websites. Web scraping software may access the World Wide Web directly using the Hypertext Transfer Protocol, or through a web browser. While web scraping can be done manually by a software user, the term typically refers to automated processes implemented using a bot or web crawler.

Downloads: 0 This Week

Last Update: 2017-10-12
See Project
9

DSTK - DataScience ToolKit

DSTK - DataScience ToolKit for All of Us

...Of course you may specify JASP for advanced data editing and RapidMiner for advanced prediction modeling. DSTK is written in C#, Java and Python to interface with R, NLTK, and Weka. It can be expanded with plugins using R Scripts. We have also created plugins for more statistical functions, and Big Data Analytics with Microsoft Azure HDInsights (Spark Server) with Livy.

Downloads: 1 This Week

Last Update: 2018-05-08
See Project
10

JAWS - Just Another Web Scraper

A simple Web Scraper using Regular Expression or Html Agility

JAWS, or Just Another Web Scraper, is part of the data scraping software developed by SVbook, alongside JATI (Image to Text) and JAVT (Video to Text). JAWS offers an easy interface for scraping data from a website using regular expressions, text preprocessing, or HTML Agility Pack.

Downloads: 2 This Week

Last Update: 2018-03-30
See Project
11

Save For Offline

Android app for saving webpages for offline reading

Save For Offline is an Android app for saving full web pages for offline reading, with lots of features and options. In your web browser, select 'Share', and then 'Save For Offline'. It saves real HTML files which can be opened in other apps and on other devices, downloading entire web pages with all assets for offline reading and viewing.

Downloads: 1 This Week

Last Update: 2023-04-12
See Project
12

Simple-Scrape

Simple-Scrape is a simple web-scraping library that allows for programmatic access to HTML code. No further techniques are needed and the library is very compact and thus easy to use.

Downloads: 0 This Week

Last Update: 2017-04-28
See Project
13

newsscrape

Collects news headlines for analysis to determine their category

...
 - It extracts the RSS feed from Google News.
 - Each news headline is matched against a Google News category such as Entertainment, Sports, etc.
 - It is called from a scheduler to collect this data at 5-minute intervals and accumulate it in a database.
 - It contains R statistical computing scripts that learn which word patterns in a headline indicate a particular category.
 - To test its accuracy in predicting the category from a news headline, select a news title from another source (e.g. http://rss.news.yahoo.com/rss/entertainment) and pass it to the R script, which outputs the category it predicts for that title.

Downloads: 0 This Week

Last Update: 2016-07-17
See Project
14

webStraktor

webStraktor is a programmable World Wide Web data extraction client. Its purpose is to scrape HTML-based content over HTTP and extract relevant information. webStraktor features a scripting language to facilitate the collection, extraction, and storage of information available on the web, including images. The scripting language uses elements of regular expression and XPath syntax.

Downloads: 0 This Week

Last Update: 2014-04-25
See Project
15

xpider

An extensible web spider (crawler) for Joomla!

The extensible web spider (Xpider) is a Joomla! component for crawling external webpages. You create a Spider and give it some Tasks (data to find) and some Seeds (web addresses) to search. The Spider's Findings (the results of its tasks) can be linked to a database.

Downloads: 0 This Week

Last Update: 2014-03-23
See Project
16

Constellio Enterprise Search engine

Open source Search Engine and Enterprise Search

Constellio is an enterprise search engine that allows companies to search all their organization's information through a single interface (Web, CRM, ERP, ECM, Mail etc.). Constellio is based on Apache Solr and Google Search Appliance's connector. Constellio has a powerful web crawler.

Downloads: 1 This Week

Last Update: 2015-03-31
See Project
17

PGBuild

Compile your mobile web pages into mobile apps via build.phonegap.com

...The spider is controlled by a project file that sets the rules for the spider and the options for the PhoneGap build service. You may create and manage your PhoneGap project source files manually on your web server or use PGBuild to connect to a CMS to extract content. PGBuild is managed from a small widget that you may use yourself or integrate into a CMS.

Downloads: 0 This Week

Last Update: 2015-08-07
See Project
18

Heritrix: Internet Archive Web Crawler

The archive-crawler project is building Heritrix: a flexible, extensible, robust, and scalable web crawler capable of fetching, archiving, and analyzing the full diversity and breadth of internet-accessible content.

21 Reviews

Downloads: 11 This Week

Last Update: 2013-06-05
See Project
19

xWebScraper

This is an advanced web scraper with a user-friendly GUI that lets the user define rules and web addresses from which to extract data, either once or periodically, and a target database field in which the data should be saved.

Downloads: 0 This Week

Last Update: 2014-07-13
See Project
20

datalus

PHP web API designed to simplify object handling (loading, saving, querying, displaying, and editing), abstract the data from its display structure and layout, and allow the target data to be delivered in any supported format without special logic.

Downloads: 0 This Week

Last Update: 2016-05-28
See Project
21

ItSucks

This project is a Java web spider (web crawler) with the ability to download (and resume) files. It is also highly customizable with regular expressions and download templates. All backend functionality is also available in a separate library.

3 Reviews

Downloads: 3 This Week

Last Update: 2013-04-29
See Project
22

WebScraper - Web Data Extraction

An easy-to-set-up web scraper written in Java. It uses a modified regular expression syntax for quickly writing complex patterns that parse data out of a website. It contains a GUI tool for testing your configuration scripts and can be fully automated through the command line.

1 Review

Downloads: 0 This Week

Last Update: 2013-04-24
See Project
23

arachnode.net

arachnode.net is an open source Web crawler for downloading, indexing and storing Internet content, including e-mail addresses, files, hyperlinks, images, and Web pages. It is written in C# and uses SQL Server 2008. See http://arachnode.net for the latest version.

1 Review

Downloads: 0 This Week

Last Update: 2014-06-25
See Project
24

BTV Rename

The goal of this project is 100% TVDB recognition and SxxExx renaming of all files generated by BeyondTV while maintaining the BeyondTV database. Currently the project relies on TVrage and scans the entire folder which is selected by the user in the conf

Downloads: 0 This Week

Last Update: 2015-05-01
See Project
25

Methabot Web Crawler

Methanol is a scriptable multi-purpose web crawling system with an extensible configuration system and speed-optimized architectural design. Methabot is the web crawler of Methanol.

2 Reviews

Downloads: 0 This Week

Last Update: 2013-05-15
See Project


© 2026 Slashdot Media. All Rights Reserved.