Showing 415 open source projects for "convert pdf to txt"

  • 1
    Stirling-PDF

    Web application that allows you to perform operations on PDF files

    Stirling PDF is a powerful, locally hosted web-based PDF manipulation tool offering a wide range of editing, conversion, and utility features. It allows users to merge, split, compress, convert, OCR, and perform other operations on PDF files directly from a browser without uploading data to third-party servers. The tool is privacy-conscious, self-hostable via Docker, and built with modularity in mind to allow future expansion and integration.
    Downloads: 37 This Week
  • 2
    Markdown to PDF

    Hackable CLI tool for converting Markdown files to PDF using Node.js

    A simple and hackable CLI tool for converting Markdown to PDF. It uses Marked to convert Markdown to HTML and Puppeteer (headless Chromium) to convert the HTML to PDF. It also uses highlight.js for code highlighting. The whole source code of this tool is only ~250 lines of JS (~500 lines of TypeScript) and ~100 lines of CSS, so it is easy to clone and customize.
    Downloads: 3 This Week
  • 3
    Asciidoctor PDF

    Asciidoctor PDF: A native PDF converter for AsciiDoc

    A fast text processor & publishing toolchain for converting AsciiDoc to HTML5, DocBook & more. Asciidoctor is a fast, open source, Ruby-based text processor for parsing AsciiDoc® into a document model and converting it to output formats such as HTML 5, DocBook 5, manual pages, PDF, EPUB 3, and other formats. Asciidoctor also has an ecosystem of extensions, converters, build plugins, and tools to help you author and publish content written in AsciiDoc.
    Downloads: 6 This Week
  • 4
    OpenDataLoader PDF

    PDF Parser for AI-ready data. Automate PDF accessibility

    OpenDataLoader PDF is an open-source document processing system designed to convert complex PDF files into structured, AI-ready formats such as Markdown, JSON, and HTML while preserving layout, hierarchy, and semantic meaning. It focuses on enabling downstream use cases like retrieval-augmented generation (RAG), knowledge extraction, and document intelligence pipelines by maintaining accurate reading order and spatial metadata through bounding boxes.
    Downloads: 9 This Week
  • 5
    PDFsam

    PDFsam, a desktop application to split, merge, mix, rotate PDF files

    PDFsam Basic is our free and open-source desktop application to split, merge, extract pages, rotate and mix PDF files. PDFsam Visual is a powerful tool to visually compose PDF files, reorder pages, delete pages, split, merge, rotate, encrypt, decrypt, extract text, convert to grayscale, and crop PDF files. PDFsam Basic is written using JavaFX. Since version 4 it has been released as a self-contained application that bundles a jlinked JDK, while version 3 requires a Java Runtime Environment 8 with JavaFX installed in order to run.
    Downloads: 208 This Week
  • 6
    tableExport.jquery.plugin

    jQuery plugin to export an HTML table to JSON, XML, CSV, TSV, TXT, SQL

    jQuery plugin to export an HTML table to JSON, XML, CSV, TSV, TXT, SQL, Word, Excel, PNG, and PDF.
    Downloads: 0 This Week
  • 7
    iLovePDF Api

    iLovePDF Rest Api - PHP Library

    ...We offer a simple and concise API Reference and Guide as well as API Libraries with their own docs too. Our infrastructure uses the best PDF technology for processing PDF files. Merge and split documents with a variety of custom options. Remove, extract or organize PDF pages as you need. Reduce the size of your PDF while maintaining its original quality and formatting. Easily convert Images, MS Word, PowerPoint and Excel files into non-editable PDF documents. Convert PDF documents to JPG images or to PDF/A format.
    Downloads: 7 This Week
  • 8
    BentoPDF

    A Privacy First PDF Toolkit

    BentoPDF is a self-hosted, open-source PDF toolkit that provides a suite of local PDF manipulation features for users who want full control over their documents without relying on cloud PDF services. It offers functionality to merge, split, compress, rotate, and convert PDFs through an easy-to-deploy container or local installation, making it ideal for individuals and teams that handle large volumes of PDF files regularly.
    Downloads: 51 This Week
  • 9
    Pandoc

    The universal markup converter

    Pandoc is a universal document converter able to convert files from one markup format into another. With Pandoc, you have a Swiss Army knife of a converter, able to convert practically any markup format into any other. Pandoc contains a Haskell library for conversions as well as a command-line tool that uses this library. It can convert to and from just about anything: lightweight markup formats, HTML formats, documentation formats, ebooks, TeX formats, word processor formats...
    Downloads: 260 This Week
  • 10
    MinerU

    A high-quality tool for converting PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
    Downloads: 13 This Week
  • 11
    PDFPatcher

    A versatile toolkit for PDF manipulation

    PDFPatcher (aka “PDF补丁丁”) is a versatile toolkit for PDF manipulation—editing document metadata, bookmarks, page layout, content restrictions, rotation, compression, merging/splitting, image extraction, and more, all within an intuitive interface. Merge/split PDFs or images, preserve or add bookmarks, and set page dimensions. Batch style/color/target changes, regex/XPath search/replace, mid‑page positioning. Modify PDF metadata, page numbers, links, initial view mode, and remove open actions.
    Downloads: 35 This Week
  • 12
    PyMuPDF

    Python bindings for MuPDF's rendering library.

    ...The viewer is small, fast, yet complete. It supports many document formats, such as PDF, XPS, OpenXPS, CBZ, EPUB, and FictionBook 2. You can annotate PDF documents and fill out forms with the mobile viewers (this feature is coming soon to the desktop viewer as well). The command line tools allow you to annotate, edit, and convert documents to other formats such as HTML, SVG, PDF, and CBZ. You can also write scripts to manipulate documents using Javascript. ...
    Downloads: 19 This Week
  • 13
    Spatie Browsershot

    Convert HTML to an image, PDF or string

    Browsershot is a PHP package that allows developers to convert web pages into images or PDFs. It utilizes headless Chrome to render pages accurately and can be used to capture screenshots, generate PDFs, and manipulate web content programmatically. It is especially useful for applications that require generating visual content from dynamic or static web pages.
    Downloads: 2 This Week
  • 14
    image-to-pdf

    Image to PDF converter is a small app that is free to use

    This is a desktop application that converts multiple images into a single PDF. The application is easy to use: download and run the .exe file, browse to your images, and select them. Then enter a name for the PDF and click Convert; the PDF file is saved in the folder that contains the selected images.
    Downloads: 4 This Week
  • 15
    PDFStitcher

    Code repository for PDFStitcher, a utility to stitch together PDFs

    The open source PDF stitching software for sewists, by sewists. PDFStitcher is a utility for stitching together many PDF pages from one document into a single page. This is also called "N-Up" or page imposition. This program was created in order to convert sewing patterns into a convenient format for projecting, though it could be used to stitch together any PDF.
    Downloads: 11 This Week
  • 16
    Eisvogel

    A pandoc LaTeX template to convert markdown files to PDF or LaTeX

    A clean pandoc LaTeX template to convert your markdown files to PDF or LaTeX. It is designed for lecture notes and exercises with a focus on computer science. The template is compatible with Pandoc 3. Alternatively, if you don't want to install LaTeX, you can use the Docker image named pandoc/extra. The image contains pandoc, LaTeX, and a curated selection of components such as the eisvogel template, pandoc filters, and open source fonts.
    Downloads: 8 This Week
  • 17
    Stirling-PDF

    #1 Locally hosted web application that allows you to work on PDFs

    This is a robust, locally hosted web-based PDF manipulation tool using Docker. It enables you to carry out various operations on PDF files, including splitting, merging, converting, reorganizing, adding images, rotating, compressing, and more. This locally hosted web application has evolved to encompass a comprehensive set of features, addressing all your PDF requirements. Stirling PDF does not initiate any outbound calls for record-keeping or tracking purposes. All files and PDFs...
    Downloads: 92 This Week
  • 18
    File Converter

    Simple tool which allows you to convert and compress files

    File Converter is a minimalist open‑source tool (GPL‑3.0) that lets users convert and compress one or multiple files directly via the Windows Explorer context menu. It integrates with powerful back-end utilities (FFmpeg, ImageMagick, Ghostscript) to handle a broad range of media and document transformations. File Converter is a personal open source project started in 2014. I have put hundreds of hours into adding, refining and tuning File Converter with the goal of making the conversion and...
    Downloads: 31 This Week
  • 19
    HDoujin Downloader

    An easy-to-use manga and dōjinshi downloader supporting 800+ websites

    HDoujin Downloader is a manga and dōjinshi download manager supporting 800+ websites across many different languages.
    Downloads: 22 This Week
  • 20
    DOCX Document Converter

    Convert .docx to .md/.txt and .html. Free, unlimited, fast.

    A simple, free, unlimited, secure web-based tool that converts Microsoft Word documents (.docx) into Markdown (.md/.txt) and HTML files. Perfect for developers, writers, and anyone who needs to transform .docx MS Office Word documents into web-friendly or AI context friendly formats. Unlike those other jerks on the web that charge many dollars per month for this, I made it free, unlimited and open source. This is a better version of 'convert docx to txt' since .md files can be opened in notepad++ just the same AND they preserve formatting too! ...
    Downloads: 13 This Week
  • 21
    DeckTape

    PDF exporter for HTML presentations

    DeckTape is a high-quality PDF exporter for HTML presentation frameworks. DeckTape is built on top of Puppeteer which relies on Google Chrome for laying out and rendering Web pages and provides a headless Chrome instance scriptable with a JavaScript API. DeckTape currently supports the following presentation frameworks out of the box. DeckTape also provides a generic command that works by emulating the end-user interaction, allowing it to be used to convert presentations from virtually any kind of framework. ...
    Downloads: 8 This Week
  • 22
    PDF-utility

    PDF Utility is a tool designed to efficiently manipulate PDF files

    Digna PDF Utility is a tool designed to efficiently manipulate PDF documents. It offers a range of functions, including adding page numbers, deleting unwanted pages, merging multiple PDFs into a single file, converting PDF to DOCX and vice versa, protecting a PDF file with a password, and displaying PDF content.
    Downloads: 7 This Week
  • 23
    MarkPDFDown

    A high-quality PDF to Markdown tool based on large language models

    MarkPDFdown is an open-source document processing tool designed to convert PDF files into structured Markdown output that can be easily used for documentation, content pipelines, and AI processing workflows. The project focuses on extracting text, formatting, and structural information from complex PDF documents and transforming that information into clean Markdown that preserves the original hierarchy of headings, paragraphs, tables, and lists.
    Downloads: 6 This Week
  • 24
    Gotenberg

    A Docker-powered stateless API for PDF files

    Gotenberg provides a developer-friendly API to interact with powerful tools like Chromium and LibreOffice for converting numerous document formats (HTML, Markdown, Word, Excel, etc.) into PDF files, and more! Thanks to Docker, you don't have to install each tool in your environments; drop the Docker image in your stack, and you're good to go! The webhook feature allows you to upload the output file to the destination of your choice. There are many options to fit your requirements, from the...
    Downloads: 0 This Week
  • 25
    KOReader

    An ebook reader application supporting PDF, DjVu, EPUB, FB2, etc.

    KOReader is a document viewer for E Ink devices. Supported file formats include EPUB, PDF, DjVu, XPS, CBT, CBZ, FB2, PDB, TXT, HTML, RTF, CHM, DOC, MOBI and ZIP files. It’s available for Kindle, Kobo, PocketBook, Android and desktop Linux. Runs on embedded devices (Cervantes, Kindle, Kobo, PocketBook, reMarkable), Android and Linux computers. Developers can run a KOReader emulator on Linux and macOS. Multilingual user interface with a highly customizable reader view and many typesetting options. ...
    Downloads: 126 This Week