Award-winning proxy networks, AI-powered web scrapers, and business-ready datasets for download.
How the world collects public web data
Bright Data is a leading data collection platform, enabling businesses to collect crucial structured and unstructured data from millions of websites through our proprietary technology. Our proxy networks give you access to sophisticated target sites using precise geo-targeting. You can also use our tools to unblock tough target sites, accomplish SERP-specific data collection tasks, manage and optimize your proxy performance, and automate all of your data collection needs.
Cloud data warehouse to power your data-driven innovation
BigQuery is a serverless and cost-effective enterprise data warehouse that works across clouds and scales with your data.
BigQuery Studio provides a single, unified interface for all data practitioners of various coding skills to simplify analytics workflows from data ingestion and preparation to data exploration and visualization to ML model creation and use. It also allows you to use simple SQL to access Vertex AI foundational models directly inside BigQuery for text processing tasks, such as sentiment analysis, entity extraction, and many more without having to deal with specialized models.
Utility to validate, extract, show and create digital documents included in files with "pk7" and "fp7" extensions such as electronic bills provided by some phone companies in Spain like Movistar and Telefonica.
An all-in-one authentication with MySQL as backend. Features:
- Howto/Document
- user info
- libnss-mysql
- pam-mysql
- usersql
- pdbsql (samba)
- radius-mysql
- mail
NZBGetter is a PHP script for Linux-based systems that spiders NZB index sites for NZB files matching your predefined search patterns. The script downloads matching NZB files and passes them to your Usenet reader.
POESIA = Public Opensource Environment for a Safer Internet Access
An open source Internet content filter (multimodal, multilingual) aimed at the protection of youth (in schools...); partly funded by the European Commission.
Flexible time and billing software that enables teams to easily track time and expenses for payroll, projects, and client billing.
Time is money, and we understand how challenging it can be to keep track of employee hours. The constant reminders to log timesheets so your business can increase billables, run an accurate payroll and remove the guesswork from project estimates – we get it.
LaTeX Letterizer Project is a robust open source PDF document generator application for desktop environments. It uses the dinbrief class written by K.D. Braune and R. Gussmann to produce high-quality letters. Visit our enhanced website below.
Language teaching/learning aid. Laid generates LaTeX vocabulary lists and double-sided printed memorization cards from arbitrary UTF-8 encoded LaTeX text.
- LaTeX-based booklet generation
- LaTeX-based vocabulary lists
- LaTeX-based memorization cards
DIET-PC (DIskless Embedded Technology Personal Computer) is a software kitset enabling IT professionals to build an open source GUI appliance based on commodity x86 (PC), PowerPC (Mac) or ARM (handheld) hardware, using an embedded Linux methodology.
The project is an equivalent of the well-known systems administration tool "cfengine". The aim of the project is to provide a safer and extensible framework for distributed system configuration management, using standard tools only.
TeXas assists the building process of LaTeX files and provides useful scripted features. It mainly acts as an automated build system. TeXbooklet creates booklets out of LaTeX files while TeXlayout creates a LaTeX file with a standard layout in it.
Planfix: Manage Projects, Team's Tasks and Business Processes
All-in-One Enterprise-Level Software is Now Available for SMB
Planfix is like a souped-up business process management system for folks who really know their stuff. It's built to help you dive deeper and gives you more options than your run-of-the-mill project and task management systems. Best part? Even small businesses and non-profits can get in on the action.
The Dublin Core Meta Toolkit transforms data collected via different methods into Dublin Core compatible meta data. The Toolkit is ideal for converting formats from Microsoft Access, MySQL and comma delimited value (CSV).
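As a rough illustration of the kind of conversion described above, the sketch below maps CSV columns to Dublin Core elements. It is not the Toolkit's own code; the function name `csv_to_dublin_core`, the `records`/`record` wrapper elements, and the column mapping are assumptions made for this example. Only the Dublin Core element namespace is standard.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Standard Dublin Core element set namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def csv_to_dublin_core(csv_text, column_map):
    """Convert CSV text to XML using Dublin Core elements.

    column_map maps a CSV column name to a Dublin Core element name,
    e.g. {"Name": "title"}. The wrapper elements "records" and "record"
    are hypothetical and chosen only for this sketch.
    """
    root = ET.Element("records")
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, "record")
        for column, dc_element in column_map.items():
            value = (row.get(column) or "").strip()
            if value:  # skip empty cells
                child = ET.SubElement(record, f"{{{DC_NS}}}{dc_element}")
                child.text = value
    return ET.tostring(root, encoding="unicode")


sample = "Name,Author\nMoby Dick,Herman Melville\n"
xml_output = csv_to_dublin_core(sample, {"Name": "title", "Author": "creator"})
print(xml_output)
```

The mapping is passed in rather than hard-coded because column names differ between sources such as Access exports, MySQL dumps and hand-made CSV files.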
This project is designed to optimize search engine results by managing your web server sitemaps. The software combines both command line processes and a web user interface with a highly configurable architecture.
Command-line utility to convert between all 183 formats supported by OpenOffice. Converts doc to html, html to png, etc. Requires OpenOffice but does not require any previously installed OpenOffice macros. Uses the Python interpreter that is integrated in
The aim of MIEX (Metadata and Information Extractor from small XML documents) is to create a wrapper for the Stanford Parser, to extract and store metadata (syntactic structures, relationships among words...) from simple XML documents.
Unimatrix is the collective work of Matrix, Tsunami and Mainframe. Unimatrix allows for a setup of distributed file repositories. It is completely peerless and uses HTTP/1.0 as its transfer medium. Unimatrix can easily be used for file sharing and since its
A Bourne shell script which gives your Debian Linux computer the ability to read and install Windows programs, mount HFS and HFS+ formatted volumes, LSB, Red Hat, Stampede, Slackware Packages, and most (if not all) of the media codecs out there.
BibCollect is a free utility for the automatic download of BibTeX entries from various public databases. It searches for citations in LaTeX files and automatically assembles an appropriate BibTeX file.
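The citation search step can be pictured with a small sketch. This is not BibCollect's implementation, only a minimal example of pulling citation keys out of LaTeX source with a regular expression. The function name `extract_citation_keys` is hypothetical, and the pattern deliberately ignores edge cases such as commented-out lines.

```python
import re

# Matches \cite and variants such as \citep or \citet, with optional
# [..] arguments, and captures the comma-separated keys inside the braces.
CITE_PATTERN = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")


def extract_citation_keys(latex_source):
    """Return the unique citation keys in order of first appearance."""
    keys = []
    for match in CITE_PATTERN.finditer(latex_source):
        for key in match.group(1).split(","):
            key = key.strip()
            if key and key not in keys:
                keys.append(key)
    return keys


text = r"See \cite{knuth84, lamport94} and \citep[p.~3]{knuth84}."
print(extract_citation_keys(text))
```

The resulting key list is what a tool of this kind would then look up in a bibliographic database, which is outside the scope of this sketch.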
This is a set of tools to work with Java classes. The tools resemble certain Unix commands that work with ELF files, such as ldd, nm and dump, and provide class disassembly capability.
jRouter is a Web-based Linux router management system. It's designed to be a simple all-in-one router setup and management utility. It allows configuration of network interfaces, dhcpd, iptables, port forwarding, and IP/MAC address filters.
POPsearch is a desktop search engine that's designed to help you find information on your computer. This information can then be accessed remotely with RSS feeds, email feeds, or from any computer that has a web browser.
PdfRipImage is a program to automatically extract images from PDF documents and convert them to a format of your choice (such as JPEG or TIFF). It runs on UNIX-like platforms and requires utilities from netpbm and xpdf.
This project provides market data (stock ticker) tools for ticker data from various US-based equity and options feed providers. This includes feeds from NYSE, AMEX, and NASDAQ. The tools are for storing a copy of the feeds, replaying them, and detecting gaps in them.
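Gap detection can be sketched as follows, assuming each feed message carries a monotonically increasing sequence number. That assumption, and the function name `find_gaps`, are illustrative and not taken from the project itself.

```python
def find_gaps(sequence_numbers):
    """Return (first_missing, last_missing) ranges in a stream of
    sequence numbers. Duplicates and late (out-of-order) numbers are
    ignored rather than reported."""
    gaps = []
    previous = None
    for seq in sequence_numbers:
        # A jump of more than one means messages were missed.
        if previous is not None and seq > previous + 1:
            gaps.append((previous + 1, seq - 1))
        # Only advance on numbers newer than anything seen so far.
        if previous is None or seq > previous:
            previous = seq
    return gaps


print(find_gaps([1, 2, 3, 7, 8, 10]))
```

Reporting ranges rather than individual numbers keeps the output small when a long run of messages is lost at once.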